This paper has three aims: (1) to generalize a computational account of the discourse
process called CENTERING, (2) to apply this account to discourse processing in Japanese so
that it can be used in computational systems for machine translation or language
understanding, and (3) to provide some insights on the effect of syntactic factors in Japanese
on discourse interpretation. We argue that while discourse interpretation is an inferential
process, syntactic cues constrain this process, and demonstrate this argument with respect
to the interpretation of ZEROS, unexpressed arguments of the verb, in Japanese. The
syntactic cues in Japanese discourse that we investigate are the morphological markers for
grammatical TOPIC, the postposition wa, as well as those for grammatical functions such
as SUBJECT, ga, OBJECT, o and OBJECT2, ni. In addition, we investigate the role of
speaker's empathy, which is the viewpoint from which an event is described. This is
syntactically indicated through the use of verbal compounding, i.e. the auxiliary use of
verbs such as kureta, kita. Our results are based on a survey of native speakers of their
interpretation of short discourses, consisting of minimal pairs, varied by one of the above
factors. We demonstrate that these syntactic cues do indeed affect the interpretation of
ZEROS, but that having previously been the TOPIC and being realized as a ZERO also
contributes to the salience of a discourse entity. We propose a discourse rule of ZERO TOPIC
ASSIGNMENT, and show that centering provides constraints on when a zero can be
interpreted as the ZERO TOPIC.


1 Introduction

1.1 Centering in Japanese Discourse

Recently there has been an increasing amount of work in computational linguistics
involving the interpretation of anaphoric elements in Japanese (Yoshimoto, 1988; Kuno,
1989; Walker, Iida, and Cote, 1990; Nakagawa, 1992). These accounts are intended as
components of computational systems for machine translation between Japanese and
English or for natural language processing in Japanese alone. This paper has three aims:
(1) to generalize a computational account of the discourse process called CENTERING
(Sidner, 1979; Joshi and Weinstein, 1981; Grosz, Joshi and Weinstein, 1983; Grosz, Joshi
and Weinstein, 1986), (2) to apply this account to discourse processing in Japanese so
that it can be used in computational systems, and (3) to provide some insights on the
effect of syntactic factors in Japanese on discourse interpretation.

In the computational literature, there are two foci for research on the interpretation
of anaphoric elements such as pronouns. The first focuses on an inferential
process driven by the underlying semantics and relations in the domain (Hobbs, 1985a;
Hobbs et al., 1987; Hobbs and Martin, 1987). The opposite focus concentrates on the role
of syntactic information such as what was previously the topic or subject (Hobbs, 1976b;
Kameyama, 1985; Yoshimoto, 1988). We will argue for an intermediate position with
respect to the interpretation of ZEROS, unexpressed arguments of the verb, in Japanese.
Our position is that the interpretation of zeros is an inferential process, but that syntactic
information provides constraints on this inferential process (Joshi and Kuhn, 1979; Joshi
and Weinstein, 1981). We will argue that syntactic cues and semantic interpretation are
mutually constraining (Prince, 1981b; Prince, 1985; Hudson-D'Zmura, 1988).

The syntactic cues in Japanese discourse that we investigate are the morphological 
markers for grammatical topic, the postposition wa, as well as those for grammatical 
functions such as SUBJECT, ga, OBJECT, o and OBJECT2, ni. In addition, we investigate
the role of speaker's empathy, which is the viewpoint from which an event is described. 
This can be syntactically indicated through the use of verbal compounding, i.e. the 
auxiliary use of verbs such as kureta, kita. 

In addition to the argument that a purely inference-based account does not consider 
limits on processing time, another argument against a purely inference-based account is 
provided by the minimal pair below. Here, the only difference is whether Ziroo is the 
subject or the object in the second utterance. Note that the interpretation of zeros is 
indicated in parentheses: 

(1) a. Taroo ga kooen o sanpositeimasita. 
Taroo SUBJ park in walking-was 
Taroo was taking a walk in the park. 

b. Ziroo ga hunsui no mae de mitukemasita.
Ziroo SUBJ OBJ fountain of front in found
Ziroo found (Taroo) in front of the fountain.

c. kinoo no siai no kekka o kikimasita. 
SUBJ OBJ yesterday of game of scores OBJ asked 
(Ziroo) asked (Taroo) the score of yesterday's game. 



(2) a. Taroo ga kooen o sanpositeimasita. 
Taroo SUBJ park in walking-was 
Taroo was taking a walk in the park. 

b. Ziroo o hunsui no mae de mitukemasita. 
SUBJ Ziroo OBJ fountain of front in found 
(Taroo) found Ziroo in front of the fountain. 

c. kinoo no siai no kekka o kikimasita. 
SUBJ OBJ yesterday of game of scores OBJ asked
(Taroo) asked (Ziroo) the score of yesterday's game. 

In 1b and 2b, the syntactic position in which Ziroo is realized has the effect that 1c
means Ziroo asked Taroo the score of yesterday's game, while 2c means Taroo asked Ziroo
the score of yesterday's game. On the other hand, some purely syntactic accounts require
that antecedents for zeros be realized as the grammatical topic, and thus cannot explain
the above example because Taroo is never explicitly marked as the topic (Yoshimoto,
1988).

In the literature, zeros are known as zero pronouns. We adopt the assumption of
earlier work that the interpretation of zeros in Japanese is analogous to the
interpretation of overt pronouns in other languages (Kuroda, 1965; Martin, 1976; Kameyama,
1985). Japanese also has overt pronouns, but their use is rare in normal speech,
and is limited even in written text. This is mainly because overt pronouns
like kare ('he') and kanozyo ('she') were introduced into Japanese in order to translate
gender-insistent pronouns in foreign languages (Martin, 1976). In this paper, we only
consider zeros in subcategorized-for argument positions. Since Japanese doesn't have
subject or object verb agreement, there is no syntactic indication that a zero is present
in an utterance other than information from subcategorization.¹

First, in section 1.2 we describe the methodology that we applied in this investigation.
In section 2 we present the theory of centering and some illustrative examples.
Then, in a later section, we discuss particular aspects of Japanese discourse context, namely
grammatical topic and speaker's empathy. We will show how these can easily be
incorporated into a centering account of Japanese discourse processing, and give a number of
examples to illustrate the predictions of the theory. We also discuss the way in which a
discourse center is instantiated.

We then propose a discourse rule of ZERO TOPIC ASSIGNMENT, and use the
centering model to formalize constraints on when a zero may be interpreted as a ZERO
TOPIC. Our account makes a distinction between two notions of topic, grammatical
topic and zero topic. The grammatical topic is the wa-marked entity, which is by default
predicted to be the most salient discourse entity in the following discourse. However,
there are cases in which it may not be, depending on whether zero topic assignment
applies. This analysis provides support for Shibatani's claim that the interpretation of
the topic marker, wa, depends on the discourse context (Shibatani, 1990). Zero topic
assignment actually predicts ambiguities in Japanese discourse interpretation and
provides a mechanism for deriving interpretations that previous accounts claim would be
unavailable.

We delay the review of related research to a later section, where we can contrast it with
our account. The two major previous accounts are those of Kuno (Kuno, 1972; Kuno,
1976b; Kuno, 1987; Kuno, 1989) and Kameyama (Kameyama, 1985; Kameyama, 1986;
Kameyama, 1988). Finally, we summarize our results and suggest topics for
future research.



1 When zero pronouns should be stipulated is still a research issue. For example, (Hasegawa, 1984)
described a zero pronoun as a phonetically null element in an argument position. However, as shown
in the following example, (Terazu, Yamanasi, and Inada, 1980) assumed that zero pronouns are not
limited in their distribution and stipulated them in adjunct positions as well (Iida, 1992).

    Taroo wa Hanako no kaban o mitukemasita.
    Taroo TOP/SUBJ Hanako GEN bag OBJ found
    Taroo found Hanako's bag.

    tanzyoobi no purezento o iremasita.
    birthday GEN present OBJ put
    (Taroo) put a birthday present (in her bag).



1.2 Methodology

Most of the examples in this paper are constructed as four-utterance discourses that fit
one of a number of structural paradigms. In all of the paradigms, a discourse entity is
introduced in the first utterance, and established by the second utterance as the CENTER,
what the discourse is about. The manipulations of context occur with the third and the
fourth utterances. In each case the zero in the third utterance cospecifies the entity
already established as the center in the second utterance. The fourth utterance consists
of a potentially ambiguous sentence containing two zeros. The variations in context are
as shown below:



Third Utterance             Fourth Utterance
SUBJECT    OBJECT(2)        SUBJECT    OBJECT(2)        EXAMPLES
zero       NP(o or ni)      zero       zero
zero       NP(o or ni)      zero       zero, empathy    36
NP(ga)     zero             zero       zero             34
NP(wa)     zero             zero       zero
NP(ga)     zero             zero       zero, empathy    35


Thus we are manipulating factors such as whether a discourse entity is realized in 
subject or object position in the third utterance, whether a discourse entity realized 
in subject position is ga-marked or wa-marked in the third utterance, and whether a
discourse entity realized in the fourth utterance in object position is marked as the locus 
of speaker's empathy. 

We collected a group of about 35 native speakers by solicitation on the net to provide 
judgements for most of the examples given in this paper. These native speakers were 
readers of the newsgroups sci.lang.japanese and comp.research.japan. They were thus
typically well-educated, bilingual engineers. Whenever an example was tested in this
way, we provide the number of informants who chose each possible interpretation to
the right of the example. Some examples that are included for expository reasons were 
not tested. 

Participation in our survey was completely voluntary and the data was collected over 
3 surveys. Thus the numbers of subjects varied from one survey to another and this is 
reflected in the numbers accompanying our examples. This data collection was carried 
out on written examples using electronic mail in a situation in which the informants 
could take as long as they wanted to decide which interpretation they preferred. The 
instructions sent with the surveys are given in the appendix. 

This paradigm clearly cannot provide information on which interpretation a subject 
might arrive at first and then perhaps change based on other pragmatic factors, and 
thus it contrasts with reaction time studies. However the judgements given should be 
stable, and reflect the fact that our informants were able to use all the information in the 
discourse. It is a useful paradigm given that we are exploring the correlation of syntactic 
cues and discourse interpretation. It has been claimed that syntactic cues are only used 
in automatic processing and can be over-ridden by deeper processing. However Hudson's
results suggest that subjects may judge a discourse sequence to be nonsensical when it
is incoherent according to centering (Hudson-D'Zmura, 1988, Chap. 5). Di Eugenio claims
that discourse sequences in Italian that are not discourse coherent according to centering
theory produce a garden-path effect (Di Eugenio, 1990). The methods we used allow
us to explore the results of these interactions, and yet it would be beneficial for these
results to be expanded upon by careful psychological experimentation (Hudson-D'Zmura
and Tanenhaus, in press).

For most of the examples reported here, we asked subjects to choose one preferred 
interpretation instead of allowing them to rank interpretations. The motivation for doing 
this was to force differences to come out for slight preferences, with the theory being that
other variations would come out across subjects. In a few cases we allowed subjects to
indicate no preference; these examples will be clearly indicated. 

In addition, we used the same gender for multiple discourse entities to prevent any 
tendency for judgements to be influenced by gender stereotypes. We also avoided using 
verbs with causal biases toward one of their arguments, and we used few cue words 
such as but, because, then, which could result in a bias towards, say, a cause-effect or 
temporal sequence of events interpretation. We also omitted honorific markers, which are 
normally a part of Japanese ambiguity resolution.² This was done to isolate the effects
of the variables that we were exploring in this study, namely topic marking, grammatical 
function, empathy, and realization with a zero or with a full noun phrase. 



2 Centering Theory 



Within a theory of discourse, centering is a computational model of the process by
which conversants coordinate attention in discourse (Grosz, Joshi, and Weinstein, 1986).
Centering has its computational foundations in the work of Grosz and Sidner (Grosz,
1977; Sidner, 1979; Grosz and Sidner, 1985) and was further developed by Grosz, Joshi
and Weinstein (Grosz, Joshi, and Weinstein, 1983; Grosz, Joshi, and Weinstein, 1986;
Joshi and Weinstein, 1981). Centering is intended to reflect aspects of ATTENTIONAL
STATE in a tripartite view of discourse structure that also includes INTENTIONAL
STRUCTURE and LINGUISTIC STRUCTURE (Grosz and Sidner, 1986). In Grosz and Sidner's theory
of discourse structure, discourses can be segmented based on intentional structure and a
discourse segment exhibits both local and global coherence. Global coherence depends on
how each segment relates to the overall purpose of the discourse; local coherence depends
on aspects such as the syntactic structure of the utterances in that segment, the choice
of referring expressions, and the use of ellipses. Centering models local coherence and
is formalized as a system of constraints and rules. Our analysis uses an adaptation of
a centering algorithm that was developed by Brennan, Friedman and Pollard, based on
these constraints and rules (Brennan, Friedman and Pollard, 1987), (Walker, 1989).

The purpose of centering as part of a computational model of discourse interpretation
is to model ATTENTIONAL STATE in discourse in order to control inference (Joshi and
Kuhn, 1979; Joshi and Weinstein, 1981).³ Our approach to modeling attentional state
is to explore aspects of the correlation between syntax and discourse function. This
assumes that there are language conventions about discourse salience and that conversants
attempt to maintain a sense of shared context.

Section 2.1 presents the centering rules and constraints. Sections 2.2 and 2.3 illustrate
the theory and the definitions with a number of examples. Section 2.4 discusses the
centering algorithm for the resolution of zeros in Japanese.



2 While native speakers understandably found some of these examples "stilted" or "awkward", they
were still able to give their judgements based on the information that was provided in the discourses.

3 Recent work in situation theory proposes to control computation with a similar notion of
background information in terms of constants of the situation that thus are not explicitly realized in
an utterance (Nakashima, 1990). The situation-theoretic work does not as yet distinguish shared
knowledge that determines discourse salience and derives from the discourse context and the way
utterances are expressed (Clark and Haviland, 1977; Clark and Marshall, 1981; Prince, 1981b) from
shared knowledge that is part of general background knowledge such as cultural assumptions
(Prince, 1978a; Joshi, 1982) or shared knowledge that might derive from the task context (Grosz



2.1 Rules and Constraints

The centering model is very simple. Each utterance in a discourse segment has two
structures associated with it. First, each utterance in a discourse has associated with it
a set of discourse entities called FORWARD-LOOKING CENTERS, Cf. Centers are semantic
entities that are part of the discourse model. Second, there is a special member of this
set called the BACKWARD-LOOKING CENTER, Cb. The Cb is the discourse entity that the
utterance most centrally concerns, what has been elsewhere called the 'theme' (Reinhart,
1981; Horn, 1986). The Cb entity links the current utterance to the previous discourse.

The set of FORWARD-LOOKING CENTERS, Cf, is ranked according to discourse salience.
We will discuss factors that determine the ranking below. The highest ranked member
of the set of forward-looking centers is referred to as the PREFERRED CENTER, Cp.⁴ The
PREFERRED CENTER represents a prediction about the Cb of the following utterance.
Sometimes the Cp will be what the previous segment of discourse was about, the Cb,
but this is not necessarily the case. This distinction between looking back to the
previous discourse with the Cb and projecting preferences for interpretation in subsequent
discourse with the Cp is a key aspect of centering theory.

In addition to the structures for centers, Cb and Cf, the theory of centering specifies 
a set of rules and constraints. Constraints are meant to hold strictly whereas rules may 
sometimes be violated. 

• CONSTRAINTS

For each utterance Ui in a discourse segment U1, ..., Um:

1. There is precisely one backward-looking center Cb.

2. Every element of the forward centers list, Cf(Ui), must be realized in
Ui.

3. The center, Cb(Ui), is the highest-ranked element of Cf(Ui-1) that is
realized in Ui.⁵
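
Constraint 3 can be read as a simple lookup over the previous utterance's ranked list. The following is a minimal sketch of that reading, not the authors' implementation; the function name `backward_center`, the use of strings for discourse entities, and the choice of `None` for "no Cb" are assumptions made for illustration.

```python
from typing import List, Optional, Set


def backward_center(prev_cf: List[str], realized: Set[str]) -> Optional[str]:
    """Constraint 3: Cb(Ui) is the highest-ranked element of Cf(Ui-1) realized in Ui.

    prev_cf  -- Cf(Ui-1), ordered from most to least salient
    realized -- the discourse entities realized in Ui
    Returns None when no element of Cf(Ui-1) is realized in Ui.
    """
    for entity in prev_cf:  # walk the list in salience order
        if entity in realized:
            return entity
    return None


# Hypothetical entities A and B: A outranks B in the previous utterance,
# so A is the Cb whenever the current utterance realizes it.
print(backward_center(["A", "B"], {"A", "B"}))
print(backward_center(["A", "B"], {"B"}))
```

Because the lookup returns at most one entity, it also respects Constraint 1 in the sense that an utterance never receives more than one Cb.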

Constraint (1) says that there is one central discourse entity that the utterance is
about, and that is the Cb. The second constraint depends on the definition of REALIZES.
An utterance U realizes a center c if c is an element of the situation described by U, or c
is the semantic interpretation of some subpart of U (Grosz, Joshi, and Weinstein, 1986).
Thus the relation REALIZE describes zeros, explicitly realized discourse entities, and those
implicitly realized centers that are entities inferable from the discourse situation (Prince,
1978a; Prince, 1981b).

A specialization of the relation REALIZE is the relation DIRECTLY REALIZE. A center
is directly realized if it corresponds to a phrase in an utterance. We restrict our focus to
entities realized by noun phrases; however, it is clear that propositions can be centers, so
we assume that the account given here can be extended to propositional entities as well
(Webber, 1978; Sidner, 1979; Prince, 1978b; Ward, 1985; Prince, 1986).

As we discuss further in a later section, zeros refer to entities that are already in the
discourse context. The fact that the current utterance realizes one or more zeros
follows from information specified in the subcategorization frame of the verb. These
arguments must be interpreted and thus acquire a degree of discourse salience that
nonsubcategorized-for discourse entities lack.



4 The notion of preferred center corresponds to Sidner's notion of EXPECTED FOCUS (Sidner, 1983).

5 This could possibly be rephrased as: Assume the Cp(Ui-1) is the Cb(Ui) unless there is evidence to
the contrary (Carter, 1987).

Constraint (3) stipulates that the ranking of the forward centers, Cf, determines from
among the elements that are realized in the next utterance, which of them will be the Cb
for that utterance. If the PREFERRED CENTER, Cp(Ui), is realized in Ui+1, it is predicted
to be the Cb(Ui+1). We will use the following forward center ranking for Japanese:⁶

(grammatical or zero) TOPIC > EMPATHY > SUBJECT > OBJECT2
> OBJECT > OTHERS
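
The ranking above can be expressed as a sort key. The sketch below is illustrative only: the label strings, the name `rank_cf`, and the convention that each entity is passed with the single highest-ranked role it bears (for example TOPIC for a wa-marked subject) are assumptions, not notation from the paper.

```python
from typing import List, Tuple

# Forward center ranking for Japanese, most salient first.
RANK_ORDER = ["TOPIC", "EMPATHY", "SUBJECT", "OBJECT2", "OBJECT", "OTHER"]


def rank_cf(entities: List[Tuple[str, str]]) -> List[str]:
    """Order (entity, role) pairs into a Cf list, most salient first.

    Each entity is assumed to carry the highest-ranked role it bears.
    sorted() is stable, so entities with equal roles keep their input order.
    """
    ordered = sorted(entities, key=lambda pair: RANK_ORDER.index(pair[1]))
    return [name for name, _role in ordered]


# A wa-marked entity outranks an object regardless of surface order.
cf = rank_cf([("TAROO", "OBJECT"), ("ZIROO", "TOPIC")])
print(cf)      # Cf list
print(cf[0])   # the preferred center, Cp
```

The first element of the returned list is the preferred center Cp, which is all that the transition definitions later in this section need from the ranking.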

Backward-looking centers, Cbs, are often deleted or pronominalized and some
transitions between discourse segments are more coherent than others. According to the theory
of centering, coherence is measured by the hearer's inference load when interpreting a
discourse sequence (Joshi and Weinstein, 1981; Grosz, Joshi, and Weinstein, 1986). For
instance, discourse segments that continue centering the same entity are more
coherent than those that repeatedly shift from one center to another. These observations are
encapsulated in two rules:

• RULES

For each Ui in a discourse segment U1, ..., Um:

1. If some element of Cf(Ui-1) is realized as a pronoun in Ui, then so is
Cb(Ui).

2. Transition states are ordered. CONTINUE is preferred to RETAIN is
preferred to SMOOTH-SHIFT is preferred to ROUGH-SHIFT.⁷

Rule (1) captures the intuition that pronominalization is one way to indicate
discourse salience. It follows from Rule (1) that if there are multiple pronouns in an
utterance, one of these must be the Cb. In addition, if there is only one pronoun, then that
pronoun must be the Cb. For Japanese, we extend this rule directly to zeros, assuming
that zeros in Japanese correspond to destressed pronouns in English.
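
Rule 1, extended to zeros as described above, amounts to a check on a candidate interpretation. The following sketch assumes entities are represented as strings and that the caller already knows which entities are realized as zeros; the name `obeys_rule1` is hypothetical.

```python
from typing import Optional, Set


def obeys_rule1(prev_cf: Set[str], zeros: Set[str], cb: Optional[str]) -> bool:
    """Rule 1: if any element of Cf(Ui-1) is realized as a zero in Ui,
    then Cb(Ui) must be realized as a zero too.

    prev_cf -- the entities on Cf(Ui-1)
    zeros   -- the entities realized as zeros (or pronouns) in Ui
    cb      -- the proposed Cb(Ui), or None if there is none
    """
    if prev_cf & zeros:       # some previous center is realized as a zero
        return cb in zeros    # so the Cb has to be among the zeros
    return True               # the rule places no demand otherwise


# Hypothetical case: A is the only zero, so only A may be the Cb.
print(obeys_rule1({"A", "B"}, {"A"}, "A"))
print(obeys_rule1({"A", "B"}, {"A"}, "B"))
```

With a single zero the check leaves exactly one admissible Cb, which is the consequence the text draws from the rule.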

Rule (2) states that modeling attentional state depends on analyzing adjacent
utterances according to a set of transitions that measure the coherence of the discourse
segment in which the utterance occurs. Measuring coherence is based on an estimate
of the hearer's inference load, but this measure must always be relative since there is no
grammar of discourse. Thus methods for exploring these issues must use comparative
measures of how some discourses are easier to process than others. Centering theory
models this by stipulating that some transitions are preferred over others.

The typology of transitions from one utterance, Ui-1, to the next, Ui, is based on two
factors: whether the backward-looking center, Cb, is the same from Ui-1 to Ui, and
whether this discourse entity is the same as the preferred center, Cp, of Ui:⁸

1. Cb(Ui) = Cb(Ui-1), or there is no Cb(Ui-1)

2. Cb(Ui) = Cp(Ui)



6 This ranking is consistent with Kuno's Empathy Hierarchies and with Kameyama's Expected
Center Order (Kuno, 1987; Kameyama, 1985; Kameyama, 198J). This will be discussed in a later
section. We do not include discourse entities for verb phrases or other event types in this ranking
since we have not studied their contribution, but see (Sidner, 1979; Carter, 1987).

7 Smooth-shift was called shifting-1 by (Brennan, Friedman, and Pollard, 1987).

8 It is possible that restricting the relation between the Cb(Ui) and the Cb(Ui-1) to be coreference
(equality) may be too strong. Future work should examine the role of shifts to functionally
dependent entities or entities related by poset relations to the previous Cb.

If both (1) and (2) hold then we are in a CONTINUE transition. The CONTINUE
transition corresponds to cases where the speaker has been talking about a particular
entity and indicates an intention to continue talking about that entity.⁹ If (1) holds but
(2) doesn't hold then we are in a RETAIN transition. RETAIN corresponds to a situation
where the speaker is intending to shift onto a new entity in the next utterance and
is signalling this by realizing the current center in a lower ranked position on the Cf
(examples follow below).

If (1) doesn't hold then we are in one of the SHIFT states depending on whether or
not (2) holds. This definition of transition states is summarized in Figure 1 (Brennan,
Friedman, and Pollard, 1987). We will use the notation of Cb(Ui-1) = [?] for cases where
there is no Cb(Ui-1). Center instantiation will be discussed in a later section.





                      Cb(Ui) = Cb(Ui-1)       Cb(Ui) ≠ Cb(Ui-1)
                      OR Cb(Ui-1) = [?]
Cb(Ui) = Cp(Ui)       CONTINUE                SMOOTH-SHIFT
Cb(Ui) ≠ Cp(Ui)       RETAIN                  ROUGH-SHIFT

Figure 1
Centering Transition States, Rule 2

KEY
BACKWARD-LOOKING CENTER = Cb
PREFERRED CENTER = Cp
Uninstantiated Cb = [?]
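
The four cells of Figure 1 follow mechanically from conditions (1) and (2). A minimal sketch, assuming entities are strings and that an uninstantiated Cb, written [?] in the text, is encoded as `None`; the function name `transition` is hypothetical.

```python
from typing import Optional


def transition(cb: Optional[str], cp: str, prev_cb: Optional[str]) -> str:
    """Classify the transition from Ui-1 to Ui following Figure 1.

    cb      -- Cb(Ui)
    cp      -- Cp(Ui), the highest-ranked element of Cf(Ui)
    prev_cb -- Cb(Ui-1), or None when it is uninstantiated ([?])
    """
    same_cb = prev_cb is None or cb == prev_cb   # condition (1)
    cb_is_cp = cb == cp                          # condition (2)
    if same_cb:
        return "CONTINUE" if cb_is_cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb_is_cp else "ROUGH-SHIFT"


# Hypothetical entities A and B, one call per cell of the figure.
print(transition("A", "A", "A"))
print(transition("A", "B", "A"))
print(transition("B", "B", "A"))
print(transition("B", "A", "A"))
```

Keeping the classifier separate from the Cf ranking reflects the modularity noted in the text: changing how Cf is ordered changes which Cp is passed in, but not this function.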



The combination of the constraints, rules and transition states makes a set of testable 
predictions about which interpretations hearers will prefer because they require less pro- 
cessing. For example, maximally coherent segments are those that require less processing 
time. A sequence of a continue followed by another continue should only require the 
hearer to keep track of one main discourse entity, which is currently both the Cb and 
the Cp. A single pronoun in an utterance is the current Cb (by Rule 1) and can be 
interpreted to cospecify the discourse entity realized by Cp(Ui-1) in one step (Constraint
3).

The ordering of the Cf is the main determinant of which transition state holds
between adjacent utterances. This means that the predictions of the theory are largely
determined by the ranking of the items on the Cf. But there are many factors that can
contribute to the salience of a discourse entity; among them are factors that we will not
examine here such as lexical semantics, intonation, word-order, and tense.¹⁰ In this paper
we explore the influence of various syntactic factors, which we discuss in detail in a later
section. We will also examine the relative contribution of pronominalization and postposition
marking. We postulate that the Cf ordering will vary from language to
language depending on the means the language provides for expressing discourse function.



However much of this variation can be captured in the ranking of the Cf due to the
modularity of the theory.

9 A prediction made by the preference for CONTINUE is that intersentential antecedents for pronouns
will be preferred over intrasentential candidates. This preference is one that distinguishes centering
for pronoun interpretation from the proposal made by Hobbs in (Hobbs, 1976b; Hobbs, 1976a).
However this preference needs to be constrained further by the fact that sortal filters may rule out
the Cp of the previous utterance as the current Cb. In this case the data suggests that perhaps
intrasentential candidates should be preferred. Carter explored this in his extension
of Sidner's theory of local focusing.

10 See (Hudson-D'Zmura, 1988) for an examination of the role of lexical semantics in centering.



In sections 2.2 and 2.3 we will present some simple examples to motivate these
definitions. In section 2.4 we will present a slightly modified version of the centering
algorithm (Brennan, Friedman, and Pollard, 1987). In the following discussion we assume
that the centering rules and constraints, and the notion of centering transition states have
some cognitive reality (Brennan; Hudson-D'Zmura, 1988; Gordon, Grosz, and Gilliom,
199^; Hudson-D'Zmura and Tanenhaus, in press). However we make no claims about the
cognitive reality of the centering algorithm that we discuss in section 2.4.



2.2 The Distinction between Continue and Retain 

This theory predicts preferences in the interpretation of utterances whose meaning de- 
pends on parameters from the discourse context. Thus if there are still multiple possibil- 
ities for interpretation after the application of all constraints and rules, the ordering on 
transitions applies, and continue interpretations are preferred (Rule 2). Indeed, many 
cases of the preference for one interpretation over another follow directly from the dis- 
tinction between the transition states of continue and retain. Let us look at a simple 
example. In the discourse segment in 3, the zero in the second sentence is understood
as referring to Taroo, and not to Hanako. Remember that the interpretation of zeros is 
indicated with parentheses. 



(3) a. Taroo wa Hanako o eiga ni sasoimasita.
       Taroo TOP/SUBJ Hanako OBJ movie to invited
       Taroo invited Hanako to the movie.

       Cb: TAROO
       Cf: [TAROO, HANAKO]

    b. itiniti-zyuu nani mo te ni tukimasendesita.
       SUBJ all-day anything even hand to attached-not
       (Taroo) could not do anything all day.

       Cb: TAROO
       Cf: [TAROO]

In example 3, the Cf from 3a contains the discourse entity for Taroo as the first
element and for Hanako as the second element. When the unexpressed argument is
interpreted in 3b, the information from this Cf is used. Because the zero subject may
REALIZE either Taroo or Hanako, both Constraint 3 and Rule 1 would be obeyed with
either interpretation.¹¹ However by interpreting the zero as Taroo, Taroo is the Cb, and
it is possible to get a preferred CONTINUE interpretation Taroo could not do anything all
day. In this interpretation, Taroo is both the Cb(3b) and the Cp(3b).



11 The hypothesis that wa in 3a instantiates Taroo as the Cb will be discussed in a later section.



2.3 The Distinction between Smooth-Shift and Rough-Shift

In example 4, we illustrate the difference between the transition states of ROUGH-SHIFT
and SMOOTH-SHIFT. Remember that ROUGH-SHIFT is claimed to be less coherent than
SMOOTH-SHIFT (Brennan, Friedman, and Pollard, 1987). In both cases the speaker has
shifted the center to a different discourse entity. However in the SMOOTH-SHIFT
transition state, the speaker has indicated an intention to continue talking about the recently
shifted-to entity by realizing that entity in a highly ranked Cf position such as subject,
whereas no such indication is available with the ROUGH-SHIFT transition. The numbers
shown to the right of an interpretation correspond to how many native speakers preferred
that interpretation.

(4) a. Taroo ga kooen de hon o yondeimasita.
       Taroo SUBJ park at book OBJ reading-was
       Taroo was reading a book in the park.

       Cb:  [?]
       Cf:  [TAROO, BOOK]
             SUBJ   OBJ

    b. koora o kai ni baiten ni hairimasita.
       SUBJ cola OBJ buy to shop into entered
       (Taroo) entered a shop to buy a cola.

       Cb:  TAROO
       Cf:  [TAROO, COLA]      CONTINUE
             SUBJ   OBJ

    c. Ziroo wa sokode guuzen dekuwasimasita.
       Ziroo TOP/SUBJ OBJ there by chance met
       Ziroo met (Taroo) there by chance.

       Cb:  TAROO
       Cf:  [ZIROO, TAROO]     RETAIN
             TOP    OBJ

    d. eiga ni sasoimasita.
       SUBJ OBJ movie to invited
       (Ziroo) invited (Taroo) to a movie.

       Cb:  ZIROO
       Cf1: [ZIROO, TAROO]     SMOOTH-SHIFT    32
             SUBJ   OBJ
       Cf2: [TAROO, ZIROO]     ROUGH-SHIFT      2
             SUBJ   OBJ


In example 4, the use of topic marking in the phrase Ziroo wa of utterance (c) means
that (c) is interpreted as a RETAIN.[12] Ziroo becomes the most highly ranked discourse
entity for (c), although Taroo is the Cb since Taroo was most highly ranked for utterance
(b) (by Constraint 3). Then when we apply the Centering algorithm in (d), there are two
candidates for the Cb(d) from the Cf(c), both Ziroo and Taroo. However, this time when
Constraint 3 applies, stipulating that the Cb must be the highest ranked element of Cf(c)
realized in 4d, Ziroo must be the highest ranked entity realized, and therefore must be
the Cb. At this point it is clear that some kind of shift is forced by the application of
Constraint 3. The two candidates are a SMOOTH-SHIFT and a ROUGH-SHIFT. The SMOOTH-SHIFT
interpretation corresponds to the reading Ziroo invited Taroo to a movie whereas
the ROUGH-SHIFT interpretation corresponds to the Taroo invited Ziroo reading. The
SMOOTH-SHIFT interpretation is more highly ranked, thus considered more coherent, and
so is the preferred interpretation (Z = 10.93, p < .001).
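
The four transition states at work in example 4 can be written as a small decision procedure. The following is a minimal sketch, not part of the original algorithm description: the function name `classify_transition` and the string labels for entities are hypothetical, and it assumes that an uninstantiated previous Cb (shown as [?] in the examples, `None` in the code) is compatible with any current Cb.

```python
from typing import Optional, Sequence


def classify_transition(cb_prev: Optional[str], cb: str, cf: Sequence[str]) -> str:
    """Classify the transition into an utterance with backward center `cb`
    and ranked forward centers `cf` (cf[0] is the preferred center, Cp)."""
    cp = cf[0]
    # Assumption: an uninstantiated previous Cb ([?]) matches any current Cb.
    cb_kept = cb_prev is None or cb == cb_prev
    if cb_kept:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"


# Example 4, utterance by utterance.
print(classify_transition(None, "TAROO", ["TAROO", "COLA"]))      # 4b
print(classify_transition("TAROO", "TAROO", ["ZIROO", "TAROO"]))  # 4c
print(classify_transition("TAROO", "ZIROO", ["ZIROO", "TAROO"]))  # 4d, Cf1
print(classify_transition("TAROO", "ZIROO", ["TAROO", "ZIROO"]))  # 4d, Cf2
```

Under these assumptions the four calls reproduce the labels shown in example 4: the two readings of (d) differ only in whether the new Cb, Ziroo, is also the Cp.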



12 It has also been claimed that symmetric verbs such as meet by chance mark EMPATHY on the
subject (Kuno, 1976a).

2.4 The Centering Algorithm

The CENTERING ALGORITHM that was proposed by Brennan, Friedman and Pollard incorporates
the centering rules and constraints in addition to contra-indexing constraints
on coreference (Reinhart, 1976; Brennan, Friedman, and Pollard, 1987; Iida, 1992). These
contra-indexing constraints specify that in a sentence such as He likes him, he and
him cannot co-specify the same discourse entity. The algorithm applies Centering theory
to the problem of resolving anaphoric reference. Application of the algorithm requires
three basic steps.

1. GENERATE possible Cb-Cf combinations 

2. FILTER by constraints, e.g. contra-indexing, sortal predicates, centering rules 
and constraints 

3. RANK by transition orderings 

In order to apply this algorithm to Japanese, possible Cb-Cf combinations, (gen- 
erate step 1), must be constructed from the surface string and information from the 
subcategorization frame of the verb. First the verb subcategorization is examined, and if 
there are more entities than appear in the surface string, zeros are postulated as forward 
centers. These zeros are then treated just like pronouns in English by the rest of the al- 
gorithm. We use a different ranking for the Cf for Japanese than for English, but this has 
no effect on the actual algorithm itself since the Cf ranking is a declarative parameter. 
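
The zero-postulation part of the generate step can be illustrated with a short sketch. Everything named here is an assumption made for illustration: the lexicon `SUBCAT_FRAMES`, its verb keys, and the function `postulate_zeros` do not come from the paper, and a real system would take the frames from its own lexicon.

```python
from typing import Dict, List

# Hypothetical lexicon: obligatory argument roles per verb (illustrative only).
SUBCAT_FRAMES: Dict[str, List[str]] = {
    "setumeisuru": ["SUBJ", "OBJ2", "OBJ"],  # 'explain'
    "sasou": ["SUBJ", "OBJ"],                # 'invite'
}


def postulate_zeros(verb: str, overt: Dict[str, str]) -> List[str]:
    """Return the subcategorized roles that have no overt phrase in the
    surface string; each such role is postulated as a zero."""
    return [role for role in SUBCAT_FRAMES[verb] if role not in overt]


# Utterance 5c: only the object ('newly equipped functions') is overt.
print(postulate_zeros("setumeisuru", {"OBJ": "FUNCTION"}))
```

The roles returned are then treated like pronouns by the filter and rank steps.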

The steps of the algorithm can be interleaved to improve computational efficiency
(Brennan, Friedman, and Pollard, 1987). Some simple modifications are:



• Never propose a Cf that violates linguistic constraints on contra-indexing. (In 
other words, apply the contra-indexing filter as early as possible to avoid Cb-Cf 
combinations that will be eliminated by that filter.) 

• If there are pronouns in an utterance, only propose pronouns as possible Cbs. 
(Collect the pronouns from the proposed Cfs as Cbs, from Rule 1) 

In addition, it is simple to add further filters to step (2) of the algorithm. For
instance, any constraint that is lexically specified such as [±animacy] can be easily applied
as a filter. It is also possible to pursue a 'best first' strategy by interleaving steps (1), (2)
and (3) so that a CONTINUE will be found without extra processing if one exists.

In example 5, we illustrate in more detail how the steps of the algorithm work and
the difference between CONTINUE and RETAIN. Each utterance shows what the Cb and Cf
would be for that utterance. We will mostly be concerned with the process of resolving
the two zeros in utterance 5c.



(5) a. Taroo wa saisin no konpyuutaa o kaimasita.
Taroo TOP/SUBJ newest of computer OBJ bought
Taroo bought a new computer.

Cb: TAROO
Cf: [TAROO, COMPUTER]

b. John ni sassoku sore o misemasita.
SUBJ John OBJ2 at once that OBJ showed
(Taroo) showed it at once to John.

Cb: TAROO
Cf: [TAROO, JOHN, COMPUTER]   CONTINUE

c. atarasiku sonawatta kinoo o setumeisimasita.
SUBJ OBJ2 newly equipped function OBJ explained
(Taroo) explained the newly equipped functions to (John).

Cb: TAROO
Cf1: [TAROO (SUBJ), JOHN (OBJ)]   CONTINUE   27
Cf2: [JOHN (SUBJ), TAROO (OBJ)]   RETAIN   1
Cf3: [JOHN (SUBJ), JOHN (OBJ)]   CONTRA-INDEX FILTER
Cf4: [TAROO (SUBJ), TAROO (OBJ)]   CONTRA-INDEX FILTER





Example 5(c) has explained as the main verb, which requires an animate subject
and object2. Since there are two animate zeros in 5c, which are also contra-indexed
by syntactic constraints, both John and Taroo must be realized in 5c. Constraint (3)
restricts the Cb to Taroo as the highest ranked element from the Cf(5b). The interpretive
process must also generate the possible candidates for the Cf. If no constraints applied,
then all 4 candidates shown above as Cf1, Cf2, Cf3, and Cf4 would be possible. However,
the contra-indexing filter will rule out Cf3 and Cf4. As mentioned above, there is no reason
that these filters cannot be applied at the generate phase rather than later on.

The only CONTINUE interpretation available, Taroo explained the newly equipped
functions to John, corresponds to the forward centers Cf1. It is a CONTINUE interpretation
because Cb(5c) = Cb(5b) and also Cb(5c) = Cp(5c). The RETAIN interpretation is less
preferred and is defined by the fact that Cb(5c) = Cb(5b), but Cb(5c) ≠ Cp(5c). This
example supports the claim that a CONTINUE is preferred over a RETAIN (Z = 13.24, p <
.001).

In order to find this preferred CONTINUE interpretation in a 'best first' fashion, Taroo
as the Cp(U_{i-1}) would be tried first as the Cb(U_i), and as the interpretation for the
subject. Contra-indexing rules out Taroo as the object, so John would be tried next as
the object.
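
The generate, filter and rank steps for utterance 5c can be sketched as follows. This is an illustrative simplification under stated assumptions, not the algorithm as implemented by Brennan, Friedman and Pollard: the names `resolve_zeros`, `ROLE_RANK` and `TRANSITION_RANK` are hypothetical, the candidate antecedents are assumed to come from the previous Cf and to have already passed sortal filters such as animacy, at least one zero is assumed, and only the zeros are placed on the candidate Cf, as in the display of Cf1 to Cf4 above.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

ROLE_RANK = ["TOPIC", "EMPATHY", "SUBJ", "OBJ2", "OBJ", "OTHER"]
TRANSITION_RANK = ["CONTINUE", "RETAIN", "SMOOTH-SHIFT", "ROUGH-SHIFT"]


def transition(cb_prev: Optional[str], cb: str, cp: str) -> str:
    # Assumption: an uninstantiated previous Cb (None) matches any Cb.
    cb_kept = cb_prev is None or cb == cb_prev
    if cb_kept:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"


def resolve_zeros(
    cb_prev: Optional[str],
    cf_prev: Sequence[str],
    zero_roles: Sequence[str],
    antecedents: Sequence[str],
) -> List[Tuple[str, List[str]]]:
    """Return (transition, Cf) readings, best transition first."""
    readings = []
    # GENERATE: every assignment of an antecedent to each zero.
    for fillers in product(antecedents, repeat=len(zero_roles)):
        # FILTER: contra-indexing, co-arguments must be distinct entities.
        if len(set(fillers)) != len(fillers):
            continue
        ranked = sorted(zip(zero_roles, fillers),
                        key=lambda rf: ROLE_RANK.index(rf[0]))
        cf = [entity for _, entity in ranked]
        # Constraint 3: Cb is the highest ranked element of the previous
        # Cf that is realized in the current utterance.
        cb = next(e for e in cf_prev if e in cf)
        readings.append((transition(cb_prev, cb, cf[0]), cf))
    # RANK: order readings by transition preference (stable sort).
    readings.sort(key=lambda r: TRANSITION_RANK.index(r[0]))
    return readings


# Utterance 5c: zeros in SUBJ and OBJ2; animate antecedents from Cf(5b).
for label, cf in resolve_zeros("TAROO", ["TAROO", "JOHN", "COMPUTER"],
                               ["SUBJ", "OBJ2"], ["TAROO", "JOHN"]):
    print(label, cf)
```

This exhaustive version generates all four candidates before filtering. The best-first strategy described in the text would instead apply the contra-indexing check while candidates are being built and stop at the first CONTINUE.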

In the next section, we examine further the application of centering to the inter- 
pretation of zeros in Japanese. We will examine the ranking of forward centers that we 
have adopted for Japanese and explain how this is partially determined by the way the 
Japanese language allows a speaker to express discourse functions. We will also give some 
examples of the interpretation of zeros in cases involving Japanese discourse markers for
TOPIC and EMPATHY.



3 Centering in Japanese 



The theory of centering is a formal specification that is intended to model attentional 
state and is defined by the rules and constraints given in section 2.1. Attentional state 
in turn constrains the discourse participant's interpretation process; one aspect of atten- 
tional state is the notion of discourse salience. In the centering model, the ordering of the 
forward centers is an approximation of discourse salience. This in turn is the main deter- 
minant of discourse interpretation processes such as the resolution of zeros in Japanese. 
A crucial question then is what discourse factors must be considered to determine the 
ordering of the forward centers, Cf , in Japanese discourse. 

Being a subject has been shown to be an important factor for English; this is reflected
in a Cf ordering by grammatical function (Prince, 1981b; Brennan, Friedman, and
Pollard, 1987; Hudson-D'Zmura, 198?; Brennan). Aspects of surface order may also affect
the interpretation (Di Eugenio, 1990; Hajicova and Vrbova, 198?). An interpretation
algorithm can also use pronominalization as an indicator of what the speaker believes
is salient (Grosz, Joshi, and Weinstein, 1986). Furthermore, zeros in Japanese are not
realized syntactically so that there must be a way to distinguish zeros from other entities
inferred to be part of a discourse situation. Consider:

(6) Taroo ga aimasita. 
Taroo SUBJ OBJ2 met
Taroo met (0). 

This sentence is not felicitous unless the addressee has already been given some in- 
formation about the person that Taroo met, either in the current discourse or in previous 
discourses. In contrast, nonsubcategorized-for arguments like adjuncts are not necessarily 
given a specific interpretation, but rather a non-specific one. 

(7) Taroo ga Hanako ni aimasita. 
Taroo SUBJ Hanako OBJ2 met
Taroo met Hanako. 

The sentence means that Taroo met Hanako at some time in some place: the temporal- 
location of the meeting situation need not be specified. The speaker can utter this sen- 
tence even if the addressee does not know where and when Taroo met Hanako. Thus, 
in this work, we only represent obligatorily subcategorized arguments of the verb on the 
Cf, assuming that the salience of discourse entities is partially determined by virtue of 
filling a verb's argument role, and the information from the subcategorization frame is 
used to determine that a zero is present in an utterance. 

Zeros are then interpreted with reference to the current context. Prince has proposed
that the current context should be categorized by ASSUMED FAMILIARITY (Prince, 1981b;
Horn, 198?), with a concomitant goal of determining the correlation between the use of
certain linguistic forms and the types of assumed familiarity. The first division of assumed
familiarity is into the subtypes of NEW, INFERABLE and EVOKED. NEW can be divided
into BRAND-NEW, discourse entities that are both new to the discourse and new to the
hearer, and UNUSED, discourse entities old to the hearer but new to the discourse. The
information status of EVOKED can be further divided into TEXTUALLY EVOKED, old in the
discourse and therefore old to the hearer as well, and SITUATIONALLY EVOKED, entities
in the current situation. INFERABLES are technically both hearer-new and discourse-new
but depend on information that is old to the hearer and the discourse, and are often
treated by speakers as though they were both hearer-old and discourse-old. There is a
hierarchy of assumed familiarity in terms of discourse salience:

Assumed Familiarity Hierarchy (Prince 1981): 

TEXTUALLY EVOKED > SITUATIONALLY EVOKED > INFERABLE > UNUSED > BRAND-NEW

Zeros typically refer to EVOKED entities,[13] but there is a scale of relative salience
among the EVOKED entities. In our theory this is modeled with Cf ranking. We repeat
the proposed ranking of the Cf here and justify it in the following sections:[14]



13 Under certain circumstances that we cannot explore here, it appears that zeros can at times be used
to refer to inferable or unused entities, just as pronouns in English sometimes can be.

14 This ranking resembles Kuno's Empathy Hierarchy and Kameyama's Expected Center Order, but
we distinguish two kinds of TOPIC and we posit that OBJECT2 is more salient than OBJECT. We
continue Kuno's use of the term empathy to represent the EMPATHY LOCUS, whereas Kameyama
used the property IDENT for empathy (Kameyama, 198?).

Cf Ranking for Japanese

(GRAMMATICAL OR ZERO) TOPIC > EMPATHY > SUBJECT > OBJECT2 > OBJECT > OTHERS

The relevance of the notions of topic and speaker's empathy to centering is that a 
discourse entity realized as the topic or the empathy LOCUS is more salient and should 
be ranked higher on the Cf. Whenever a discourse entity simultaneously fulfills multiple 
roles, the entity is usually ranked according to the highest ranked role. 
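
Because the Cf ranking is a declarative parameter, it can be sketched as a plain ordered list. In the sketch below the names `CF_RANK` and `rank_cf` are hypothetical, and the rule that an entity with several roles is ranked by its highest role is applied without exception, although the text says only that this is usually the case.

```python
from typing import Dict, List, Set

# The Japanese Cf ranking given above, highest rank first.
CF_RANK = ["TOPIC", "EMPATHY", "SUBJ", "OBJ2", "OBJ", "OTHER"]


def rank_cf(roles_by_entity: Dict[str, Set[str]]) -> List[str]:
    """Order entities by the highest ranked role each one fills."""
    def best_role(entity: str) -> int:
        return min(CF_RANK.index(role) for role in roles_by_entity[entity])
    return sorted(roles_by_entity, key=best_role)


# Taroo is subject; Hanako is both OBJ2 and the empathy locus.
print(rank_cf({"TAROO": {"SUBJ"}, "HANAKO": {"OBJ2", "EMPATHY"}}))
```

Swapping in a different list for `CF_RANK` (for example, an English ranking by grammatical function) leaves the rest of the procedure unchanged.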



In the following sections we will discuss the motivation for this ranking. Section 3.1
discusses the role of the grammatical topic marker wa in Japanese. Section 3.2 explains
the role of empathy in Japanese discourse salience and shows that (GRAMMATICAL
OR ZERO) TOPIC > EMPATHY and that EMPATHY > SUBJ. Section 3.2.1 shows how
the centering algorithm handles utterances with empathy loci. Zero topics will not be
discussed until a later section.

3.1 Topic 

Discourse entities that are evoked, inferable or unused can be marked as the TOPIC.
The speaker cannot mark an entity as the grammatical TOPIC unless the hearer is aware
of the object that s/he is going to talk about (Prince, 1978a; Kuno, 1976b). For example:



(8) Hutari wa paatii ni kimasita. 
two-person TOP/SUBJ party to came
Speaking of two persons, they came to the party. 

Example 8 is felicitous only when hutari ('two persons') is understood as meaning 
the two people under discussion. The sentence never means that the people who came to 
the party numbered two. 

The fact that the wa-marked entity should be discourse-old is also shown by the fact
that a wh-question cannot be answered with a wa-marked NP.

(9) a. Dono hito ga Ziroo o bengosimasita ka.
which person SUBJ Ziroo OBJ defended Q
Which person defended Ziroo?

b-1. Taroo ga Ziroo o bengosimasita.
Taroo SUBJ Ziroo OBJ defended
Taroo defended Ziroo.

b-2. *Taroo wa Ziroo o bengosimasita.
Taroo TOP/SUBJ Ziroo OBJ defended
Taroo defended Ziroo.

What the question context shows is that even in a simple declarative sentence, the 
use of the topic marker wa contrasts with the subject marker ga in what is understood 
as already in the discourse context. For instance, in a discourse initial utterance, 10a 
assumes no shared information or that someone defended Ziroo and asserts that the 
someone is Taroo. In 10b, the discourse-old proposition is that Taroo did something and 
what is asserted is that what he did was to defend Ziroo. 

(10) a. Taroo ga Ziroo o bengosimasita.
Taroo SUBJ Ziroo OBJ defended
Taroo defended Ziroo.

b. Taroo wa Ziroo o bengosimasita.
Taroo TOP/SUBJ Ziroo OBJ defended
Taroo defended Ziroo.

While topics are often subjects, subject and grammatical topic need not coincide. 
Any argument can be realized as a topic, as shown in examples 11 and 12. 

(11) Taroo wa Hanako ga bengosita. 
Taroo TOP Hanako SUBJ defended 
As for Taroo, Hanako defended (him). 



(12) Tokyoo e wa Hanako ga itta. 
Tokyo to TOP Hanako SUBJ went 
To Tokyo, Hanako went. 

The assumption that the topic is more salient than the subject, when the two are 
different, is supported by the fact that an indefinite NP in subject position such as who, 
which, or somebody cannot be regarded as the TOPIC: an indefinite NP is never marked
by the topic marker wa, but by the subject marker ga. For example: 

(13) Dono hito ga Ziroo o bengosimasita ka. 
which person SUBJ Ziroo OBJ defended Q 
Which person defended Ziroo? 



(14) *Dono hito wa Ziroo o bengosimasita ka.
which person TOP/SUBJ Ziroo OBJ defended Q
Which person defended Ziroo?

It is clear from these examples that the grammatical topic, the wa-marked entity, in
Japanese represents assumable shared information in an ongoing conversation. It has
been taken to be the 'theme' or 'what the sentence is about' (Kuno, 1973; Shibatani,
1990). In our framework, this is the role of the Cb. We will provide evidence supporting
this position in a later section. However, we claim that this is just a default and that other
factors can contribute to establishing or continuing an entity as the Cb. Kuno also claims
that a zero subject is equivalent to a wa-marked entity and we provide support for
this claim in a later section, showing that the property of having previously been the Cb, in
combination with being realized by a zero, contributes to an entity being the Cp.



3.2 Empathy 

Kuno (1976) proposed a notion of EMPATHY in order to present the speaker's position 
or identification in describing a situation. In a hugging situation involving a man named 
Taroo and his son Saburoo, Kuno notes that this situation can be described in various 
ways, some of which are shown in 15. 

(15) a. Taroo hugged Saburoo.

b. Taroo hugged his son.

c. Saburoo's father hugged him.

These sentences differ from each other with respect to camera angle, the position that
the speaker takes to observe and describe this situation. In 15a, the speaker is assumed
to be describing the event objectively: the camera is placed at the same distance from
both Taroo and Saburoo. On the other hand, the camera may be placed closer to Taroo
in 15b and closer to Saburoo in 15c. This is shown by the use of relational terms such
as son and father, respectively. The term empathy is used for this camera angle, which
indicates the speaker's position among the participants in the event described.[15]

In Japanese the realization of speaker's empathy is especially important when describing
an event involving giving or receiving. There is no way to describe a giving and
receiving situation objectively (Kuno and Kaburaki, 1977). In 16, the use of the verb
kureru indicates the speaker's empathy with Ziroo, the discourse entity realized in object
position, while in 17, the speaker's empathy with the subject Taroo is indicated by the
use of the past tense form yatta of the verb yaru.

(16) Taroo ga Ziroo ni hon o kureta.
Taroo SUBJ Ziroo OBJ2 book OBJ gave
Taroo gave Ziroo a book. EMPATHY = OBJ2 = ZIROO

(17) Taroo ga Ziroo ni hon o yatta.
Taroo SUBJ Ziroo OBJ2 book OBJ gave
Taroo gave Ziroo a book. EMPATHY = SUBJ = TAROO

A verb that is sensitive to the speaker's empathy is an empathy-loaded verb. 
The EMPATHY LOCUS is the argument position whose referent the speaker automatically 
identifies with. In other words, the verb kureru has the empathy LOCUS on the object, 
while verbs like yaru place the empathy LOCUS on the subject. 

The use of deictic verbs such as kuru ('come'), iku ('go'), okuru ('send to'), and
yokosu ('send in') also encodes speaker's empathy. For example, the speaker indicates
empathy with Taroo by using the past tense form kita of the verb kuru in the following
example.

(18) Hanako wa Taroo no tokoro ni kita.
Hanako TOP/SUBJ Taroo of place to came
Hanako came to Taroo's place.

Many Japanese verbs can be made into empathy-loaded verbs due to a productive
verb-compounding operation by which these empathy-loaded verbs are used as the auxiliary
verb, attaching to the main verb.[16] For example, kureru can be used as a suffix, to
mark OBJ or OBJ2 as the EMPATHY LOCUS. The attachment of yaru marks subject as the
EMPATHY LOCUS. The complex predicate made by this operation inherits the EMPATHY
LOCUS of the suffixed verb. For example:

(19) Hanako ga Taroo ni hon o yonde-kureta.
Hanako SUBJ Taroo OBJ2 book OBJ read-gave
Hanako did Taroo a favor in reading a book. EMPATHY = OBJ2 = TAROO

15 The speaker's position is determined not only by physical proximity but also by
emotional or social relationship. In this sense, the term speaker's identification (Kuno, 1976b) may
be more suitable than the term speaker's position. Furthermore, the notion of empathy is different
from that of perspective (Iida, 1992). Empathy is the speaker's identification with a discourse
entity, but the speaker does not have to take the perspective of the person who he empathizes with.
For example, consider the following utterance:

(i) Taroo wa Hanako ni migigawa no hon o totte-kureta.
Taroo TOP/SUBJ Hanako OBJ2 right GEN book OBJ take-gave
Taroo did Hanako a favor in taking a book on his/her right.

In this example, the speaker empathizes with Hanako as indicated by the empathy verb kureru,
yet he still can describe the given situation from Taroo's perspective, which is indicated by
ambiguity in the interpretation of the deictic expression migigawa no ('right of').

16 Certain intransitive verbs cannot be made into empathy-loaded verbs since the empathy-loaded
versions make no sense, e.g. moreru (leak).

In this case Taroo is interpreted as the empathy LOCUS due to the auxiliary kureta 
attached to the main verb. Similarly in 20, the speaker indicates empathy with Hanako 
by using the past tense form yatta of the verb yaru as an auxiliary verb to the main verb 
tazuneru. 

(20) Hanako ga Taroo o tazunete-yatta.
Hanako SUBJ Taroo OBJ visit-gave
(lit.) Hanako received a favor in visiting Taroo. EMPATHY = SUBJ = HANAKO

As demonstrated in the following examples, a discourse entity that is realized as the
EMPATHY LOCUS must be EVOKED.

(21) Taroo ga Ziroo ni okane o kasite-kureta.
Taroo SUBJ Ziroo OBJ2 money OBJ lend-gave
Taroo did Ziroo a favor in lending him some money.

(22) *Taroo ga dareka ni okane o kasite-kureta.
Taroo SUBJ somebody OBJ2 money OBJ lend-gave
Taroo did somebody a favor in lending him some money.

(23) *Taroo ga misiranu hito ni okane o kasite-kureta.
Taroo SUBJ unknown person OBJ2 money OBJ lend-gave
Taroo did a stranger a favor in lending him some money.

The contrast between 21, 22, and 23 demonstrates that the use of a brand-new
entity in the EMPATHY LOCUS position of the verb give is not acceptable. Therefore an
entity in the EMPATHY LOCUS position is ranked in a higher position on the Cf than the
subject.

3.2.1 Empathy and the Centering Algorithm

Using the Centering Algorithm, we
model empathy as a language-specific discourse factor by adding the EMPATHY-marked
discourse entity to the Cf ranking. Then preferences for CONTINUE over RETAIN when
EMPATHY is involved can be demonstrated, as in example 24 below:[17]

(24) a. Hanako wa kuruma ga kowarete komatteimasita.
Hanako TOP/SUBJ car SUBJ broken at a loss-was
Her car broken, Hanako was at a loss.

Cb: HANAKO
Cf: [HANAKO, CAR]

b. Taroo ga sinsetu-ni te o kasite-kuremasita.
Taroo SUBJ OBJ2/EMP kindly hand OBJ lend-gave
Taroo kindly did (Hanako) a favor in helping her.

Cb: HANAKO
Cf: [HANAKO (EMPATHY), TAROO (SUBJ)]

c. Tugi no hi eiga ni sasoimasita.
next of day SUBJ OBJ movie to invited
Next day (Hanako) invited (Taroo) to a movie.

Cb: HANAKO
Cf1: [HANAKO (SUBJ), TAROO (OBJ)]   CONTINUE   16
Cf2: [TAROO (SUBJ), HANAKO (OBJ)]   RETAIN   2

17 The verb form kuremasita in (24)b is the polite form of kureta, the past tense form of the verb
kureru.

In 24c, the verb invited requires an animate subject and object, and these must be 
realized by different discourse entities due to the contraindexing constraint. Hanako is the 
most highly ranked entity from 24b that is realized in 24c, and therefore must be the Cb. 
The preferred interpretation is therefore she invited him to a movie (Z = 5.25, p < .001). 
This corresponds to Cfl, the more highly ranked continue transition, in which Hanako 
is the preferred center, Cp. This interpretation can be found with minimal processing by 
trying the Cp(24b), Hanako, as the Cb(24c), by interpreting the subject zero as Hanako. 
This gives a continue transition. Then contraindexing constraints mean that Hanako 
cannot fill both argument positions, so the object position is interpreted as Taroo. This 
interpretation is found with minimal processing by interleaving the steps of the centering 
algorithm proposed in (Brennan, Friedman, and Pollard, 1987).

Note that nothing special needs to be said about the fact that empathy is the
discourse factor that made Hanako the Cp in 24b and thus predicted that Hanako would
be the Cb at 24c (pace Brennan, Friedman, and Pollard, 1987). The preference in
the interpretation follows from the distinction between CONTINUE and RETAIN and the
ranking of Cf. Thus, the centering framework is easily adapted to handle this
language-specific feature.



3.3 Topic and empathy 

In general the assignment of the empathy relationship is pragmatic. It is determined by 
the speaker's relation to the discourse participants in the discourse. In 24, for example, 
the EMPATHY relationship between the speaker and Hanako and between the speaker 
and Taroo is clear: the use of the empathy verb in the second sentence indicates that the 
speaker is closer to Hanako than to Taroo. 

However, besides cases where the speaker clearly expresses who s/he empathizes 
with, it is also possible for the context to provide some information about the speaker's 
proximity relationship with discourse participants in the given discourse, so that the 
hearer can determine the empathy relation that the speaker has in mind. In this paper, 
we only consider cases where empathy is syntactically marked by the use of empathy- 
loaded verbs. 

Kuno's notion of EMPATHY is more general. For instance, Kuno's empathy hierarchy
consists of different scales for empathy that include notions such as TOPIC and
SPEAKER (Kuno, 1987). Kuno's Topic Empathy Hierarchy suggests that the discourse
entity realized as the TOPIC will often coincide with the EMPATHY LOCUS:

Topic Empathy Hierarchy: Discourse-Topic > Discourse-Nontopic
Given an event or state that involves A and B such that A is coreferential
with the topic of the present discourse and B is not, it is easier for the
speaker to empathize with A than with B.

In support of Kuno's claim, we have found that when no empathy relation is clearly
indicated and no topic has been clearly established, it is difficult for a hearer to
determine the empathy relation that the speaker intends. Previous Cbs and current Cps
can be high on the empathy scale, and yet the discourse entity realized as the grammatical
TOPIC does not necessarily coincide with the discourse entity realized as the EMPATHY
LOCUS. A simple sentence to show this point is given below:

(25) Taroo wa Ziroo ni hon o yonde-kuremasita.
Taroo TOP/SUBJ Ziroo OBJ2 book OBJ read-gave
Taroo did Ziroo a favor in reading a book. EMPATHY = OBJ2 = ZIROO

In example 25, Taroo is the topic while Ziroo is the empathy locus. Similarly, 
a zero does not have to be realized as the empathy locus. In 26b the zero in subject 
position realizes the Cb and refers to Taroo. 

(26) a. Taroo wa syukudai o zenbu yari-oemasita.
Taroo TOP/SUBJ homework OBJ all do-finished
Taroo finished his homework.

b. Ziroo ni hon o yonde-kuremasita.
SUBJ Ziroo OBJ2 book OBJ read-gave
(Taroo) did Ziroo a favor in reading a book. EMPATHY = OBJ2 = ZIROO

TOPIC is higher than EMPATHY in the Cf ranking. The higher degree of salience of
TOPIC over EMPATHY is shown by the different interpretation of the (b) sentences in examples
27 and 28. The only difference in these examples is that Mitiko is wa-marked in 27a but
is ga-marked in 28a:

(27) a. Mitiko wa kanai o gityoo ni osite-kuremasita.
Mitiko TOP/SUBJ wife OBJ/EMP chairman OBJ2 recommend-gave
Mitiko did my wife a favor in recommending her as chairperson.

b. asu no kaihyoo-kekka o tanosimi-ni siteimasu.
SUBJ tomorrow of results OBJ look-forward doing-is
(Mitiko) is looking forward to tomorrow's results.

(28) a. Mitiko ga kanai o gityoo ni osite-kuremasita.
Mitiko SUBJ wife OBJ/EMP chairman OBJ2 recommend-gave
Mitiko did my wife a favor in recommending her as chairperson.

b. asu no kaihyoo-kekka o tanosimi-ni siteimasu.
SUBJ tomorrow of results OBJ look-forward doing-is
(Mitiko) is looking forward to tomorrow's results.
(My wife) is looking forward to tomorrow's results.

The TOPIC Mitiko is preferred as the unexpressed subject of the (b) sentence in 27.[18]
On the other hand, the subject Mitiko is not strongly preferred as shown in 28: the
zero in the second sentence in 28 is understood as referring to either Mitiko or my wife.
That is, the possible interpretations in these examples show that the NP my wife, which
is realized as the EMPATHY LOCUS, is not as salient as the TOPIC.[19]

So why is it easier to empathize with a discourse entity that has been the topic, as
Kuno demonstrates? It seems important to keep the notions of TOPIC and EMPATHY
separate, but in section 5.1 we will demonstrate an effect where the topic entity is interpreted
as the EMPATHY LOCUS. We claim that the ranking of the Cf and the potential for
a CONTINUE interpretation determine whether this effect will hold. In other words, the
tendency for the topic entity to be interpreted as the EMPATHY LOCUS follows from more
general discourse processing factors, such as a hearer preferring CONTINUE transitions
within a given local stretch of discourse.

3.4 Summary 

To summarize, we have outlined the roles of discourse markers such as those for topic 
and EMPATHY by which Japanese grammaticizes some aspects of discourse function, and 
we have argued that topic and empathy markers can only be used on entities that are 
already in the discourse context. 

One factor that hasn't been discussed is the role of pronominalization, but many
researchers have argued that discourse entities realized by pronouns are more salient
than other discourse entities (Clark and Haviland, 1977; Grosz, Joshi, and Weinstein,
1986; Kuno, 1976b; Kuno, 1987). We take zeros in Japanese to be analogous to pronouns
in English in this respect. Since pronominalization can apply at any position in the
ranking of the Cf, the role of its contribution is particularly interesting when it is in
conflict with some other factor such as grammatical function or topic marking. This will
be discussed further in a later section.

4 Initial Center Instantiation 

Initial center instantiation is a process by which a discourse entity introduced in a
segment-initial utterance becomes the Cb. In our framework, this happens as a side effect
of the Centering Algorithm. Typically, when an interpretation is found for the second
utterance in a discourse segment, the Cb becomes instantiated.[20] The Cb of an initial
utterance U_i is treated as a variable which is then unified with whatever Cb is assigned
to the subsequent utterance U_{i+1}.
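
The treatment of the initial Cb as a variable can be illustrated with a very small sketch. The function name `instantiate_initial_cb` is hypothetical, and representing the unbound Cb as `None` in a list of per-utterance Cbs is an assumption made only for this illustration; a full implementation would more likely use proper unification.

```python
from typing import List, Optional


def instantiate_initial_cb(cbs: List[Optional[str]]) -> List[Optional[str]]:
    """Given the Cb of each utterance in a segment, with None standing for
    the unbound Cb ([?]) of the initial utterance, bind that variable to
    the Cb assigned to the second utterance."""
    resolved = list(cbs)  # leave the caller's list untouched
    if len(resolved) >= 2 and resolved[0] is None:
        resolved[0] = resolved[1]
    return resolved


# A two-utterance segment whose second utterance has Cb = TAROO.
print(instantiate_initial_cb([None, "TAROO"]))
```

If the segment has only one utterance, the initial Cb simply stays unbound.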

Typically, a discourse entity is introduced as a ga-marked subject, and then is referred
to by a zero in a subsequent utterance (Clancy and Downing, 1987). Consider example
29.



18 The zero may be interpreted as indirectly referring to the speaker. This interpretation is always
possible when the verb kureru is used: the use of kureru implies that the speaker is closer to the
beneficiary argument (i.e. the o-marked NP in these examples), and the favor given to this person is
understood as a benefit to the speaker as well.

19 Although it seems as though empathy isn't higher than subject, the conflating factor is that topic
marking establishes a Cb whereas in 28 no Cb has been established. This is explained in detail in
a later section.

20 In (Walker, Iida, and Cote, 1990) we called this Center Establishment. Henceforth we will refer to
this process as Center Instantiation in order to avoid confusion with Kameyama's term center
establishment, which is a different mechanism in her theory (Kameyama, 1985).

(29) a. Taroo ga deeta o konpyuutaa ni utikondeimasita.
Taroo SUBJ data OBJ computer in was-storing
Taroo was storing the data in a computer.

Cb: [?]
Cf: [TAROO, DATA]

b. yatto hanbun yari-owarimasita.
SUBJ finally half do-finished
Finally (Taroo) was half finished.

Cb: TAROO
Cf: [TAROO]   CONTINUE



Using Taroo as the subject in 29a is not enough to establish this discourse segment 
as being about Taroo. It is the use of the zero in 29b that serves to instantiate Taroo as 
the Cb. By our definition of continue, 29b is a continue transition, because Cb(29b) = 
Cp(29b) and there was no Cb in 29a. However, Kuno argues that referring to a discourse 
entity with a zero is equivalent to marking it as the grammatical topic with wa ( Kuno, 
197^). Our interpretation of this argument is that the use of wa in a discourse initial 
utterance instantiates the wa-marked entity as the Cb in one utterance. This claim is 
supported by the contrast with the GA-WA alternation in examples 30 and 31, where 
there is a shift in interpretation depending on whether Taroo is marked with wa in the 
first sentence. 
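The way a transition is labelled in the discussion of 29b can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `classify_transition` and the `NO-CB` label are hypothetical, and the fourth label, ROUGH-SHIFT, does not occur in this section but is assumed from the usual four-way classification of centering transitions. Following the text, an uninstantiated previous Cb is treated as compatible with a CONTINUE.

```python
from typing import Optional, Sequence


def classify_transition(prev_cb: Optional[str],
                        cb: Optional[str],
                        cf: Sequence[str]) -> str:
    """Label the transition into the current utterance.

    prev_cb: Cb of the previous utterance, or None if none was instantiated.
    cb:      Cb of the current utterance.
    cf:      forward centers of the current utterance, highest ranked first.
    """
    if cb is None or not cf:
        return "NO-CB"  # hypothetical label for an utterance without a Cb
    cp = cf[0]  # the preferred center is the highest-ranked member of the Cf
    cb_unchanged = prev_cb is None or prev_cb == cb
    if cb_unchanged:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"


# Example 29b: there was no Cb in 29a, and Cb(29b) = Cp(29b) = taroo.
print(classify_transition(None, "taroo", ["taroo"]))  # CONTINUE
```

The same function labels the Cf2 reading of 32c as a RETAIN, because the Cb (hanako) is unchanged but is no longer the highest-ranked forward center.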



(30) a. Taroo ga Ziroo o min'na no mae de tatakimasita. 

SUBJ OBJ all of front in hit. 

Taroo hit Ziroo in front of all the other people. 

Cb: [?] 

Cf: [taroo, ziroo] 

b. Itiniti-zyuu, kanzen-ni musi-simasita. 
all-day completely ignored 
(Ziroo) ignored (Taroo) all day. 

Cb: TAROO 

Cf: [taroo, ziroo] 3 

Cb: ZIROO 

Cf: [ziroo, taroo] 



In example 30, Taroo is introduced by ga. In this case, it appears that there is a 
tendency due to lexical semantics to instantiate Ziroo as the Cb in the second utterance. 

By the centering definitions, taking either Taroo or Ziroo to be the Cb can result in 
a continue interpretation. However, assuming that the Cf ordering at 30a is correct, 
constraint 3 is violated by the preferred interpretation of 30b. Since both of the entities in 
Cf(30a) are realized, the Cb in 30b should be the most highly ranked one. There are two 
possible conclusions here: (1) In discourse initial utterances, when no clear indication of 
topic is given, the Cf ordering alone is not a strong constraint; (2) the ordering of the Cf 



21 These examples were tested by asking survey participants to indicate preference rankings. The 
numbers given here are only for those subjects who expressed strong preferences; some subjects 
expressed no preference. 

22 The number of subjects here is too small to test statistically. 

should be partly determined by lexical semantics or other knowledge about the situation 
being described. However, compare 30 with 31. 
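Constraint 3, as it is used in the argument above, takes the Cb of an utterance to be the highest-ranked member of the previous Cf that is realized in the current utterance. A minimal sketch, assuming that reading (the function name `backward_center` is hypothetical):

```python
from typing import Optional, Sequence, Set


def backward_center(prev_cf: Sequence[str], realized: Set[str]) -> Optional[str]:
    """Return the highest-ranked entity of the previous Cf that is realized
    in the current utterance, or None if no such entity exists."""
    for entity in prev_cf:  # prev_cf is ordered from highest to lowest rank
        if entity in realized:
            return entity
    return None


# Example 30: Cf(30a) = [taroo, ziroo], and both entities are realized
# (as zeros) in 30b, so this reading of constraint 3 selects taroo.
print(backward_center(["taroo", "ziroo"], {"taroo", "ziroo"}))  # taroo
```

Under this reading the preferred interpretation of 30b, with Ziroo as the Cb, is exactly the case that the Cf ordering of 30a does not predict.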

(31) a. Taroo wa Ziroo o min'na no mae de tatakimasita. 
SUBJ OBJ all of front in hit. 

Taroo hit Ziroo in front of all the other people. 

Cb: [taroo] 

Cf: [taroo, ziroo] 

b. Itiniti-zyuu, kanzen-ni musi-simasita. 
all-day completely ignored 
(Taroo) ignored (Ziroo) all day. 

Cb: TAROO 

Cf: [taroo, ziroo] 10 

Cb: ZIROO 

Cf: [ziroo, taroo] 4 



The use of wa in 31 seems to override the semantic preference that was exhibited in 
30, so that subjects now prefer an interpretation in which Taroo is the Cb. This shows 
that Taroo has not been instantiated as the Cb when it is time to interpret the two zeros 
in 30b. We explain the contrast by assuming that the topic instantiates the Cb when it 
is first introduced in a discourse initial utterance such as in 31a. Then the only way to 
get a CONTINUE interpretation for 31b is for Taroo to be the Cb at 31b. 

Furthermore, we can detect no differences in the interpretation of the final utterance 
between 3-utterance sequences in which an entity is introduced by wa, and 4-utterance 
sequences in which an entity is first introduced by ga and then realized by a zero in the 
second utterance. This provides further support for the claim that the status of discourse 
entities realized as grammatical topics and those realized as zero subjects is equivalent. 

4.1 Summary 

In sum, we have argued that the use of wa in a discourse initial utterance instantiates 
the wa-marked entity as the Cb. Cb instantiation can equivalently be done with a 
2-utterance sequence in which the entity is first introduced as a ga-marked subject and 
then established as the Cb in the following utterance with a zero referring to that entity. 
In addition, the fact that the Cb is uninstantiated in discourse initial utterances has the 
effect that the Cf ranking in a discourse initial utterance is not as strong a constraint as 
it is once a Cb is established. 

5 Zero Topic Assignment 

In this section we introduce the notion of a zero topic and a rule or assumption, called 
ZERO TOPIC ASSIGNMENT, that can be employed as part of the interpretive process. 

The rule of zero topic assignment defines our distinction between grammatical 
topic and zero topic. This rule allows a zero that has just been the Cb to continue as the 
Cp, even when it is not realized in a discourse salient syntactic position such as subject. 
We will demonstrate this with examples that realize both grammatical and zero topics. 
In these cases, the discourse situation is such that the hearer may maintain multiple 



23 The small number of subjects means that we can't provide statistical support for this claim. 

hypotheses about where the speaker's attention is directed, and must determine whether 



Zero Topic Assignment 

When a zero in Ui+1 represents an entity that was the Cb(Ui), and when 
no other CONTINUE transition is available, that zero may be interpreted 
as the ZERO TOPIC of Ui+1. 

What this means is that, in certain discourse environments, the entity that was previously 
the Cb is predicted to continue as the Cb. We conjecture that ZTA is applicable 
in all free word-order languages with zeros. However, zero topic assignment is optional; 
here we have suggested 2 constraints on when it applies. We will give examples 
below of cases where it doesn't apply. 

The option of ZERO TOPIC assignment (henceforth ZTA) has been overlooked in 
previous treatments of zeros in Japanese. ZTA explains why the discourse entity Hanako, 
which is realized as OBJECT2 in 32c is interpreted as the SUBJECT of 32d. 

(32) a. Hanako wa siken o oete, kyoositu ni modorimasita. 

Hanako top/subj exam OBJ finish classroom to returned 
Hanako returned to the classroom, finishing her exams. 

Cb: hanako 

Cf: [hanako, exam] 

b. hon o locker ni simaimasita. 
SUBJ book OBJ locker in took-away 
She put her books in the locker. 



Cb: HANAKO 

Cf: [hanako, book] CONTINUE 

c. Itumo no yooni Mitiko ga mondai no tokikata o setumeisidasimasita. 
always like SUBJ Mitiko OBJ2 problem solve-way OBJ explained 
Mitiko, as usual, explained (to Hanako) how to solve the problems. 

Cb: HANAKO 

Cf1: [hanako, mitiko, solution] ZTA CONTINUE 
TOP, SUBJ, OBJ 

Cf2: [mitiko, hanako, solution] RETAIN 
SUBJ, OBJ2, OBJ 


d. ohiru ni sasoimasita. 
SUBJ OBJ lunch to invited 
(Hanako) invited (Mitiko) to lunch. 



24 While some of the utterance sequences we examine are potentially ambiguous for native speakers, 
the examination of these discourse situations offers considerable insight into those where there is no 
ambiguity. 

25 We only look at object topics here but there may be limits as to how lowly ranked on the Cf an 
entity can be and still be the zero topic, e.g. by-passive agentive. 

Cb1: HANAKO 
Cf1: [hanako, lunch, mitiko] CONTINUE from Cf1(c) 28 
SUBJ, OBJ2, OBJ 

Cb2: MITIKO 
Cf2: [mitiko, lunch, hanako] SMOOTH-SHIFT from Cf2(c) 6 
SUBJ, OBJ2, OBJ 
The possibility of ambiguity as to the attentional state of the speaker is reflected in 
the fact that there are two possible Cfs for 32c; Cf2 of 32c is the only Cf possible without 
ZTA, and represents a retain rather than a continue. By the formulation of the ZTA 
rule above, ZTA is triggered by the fact that no continue transition is available. 
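The two candidate Cf lists for 32c can be generated mechanically. The sketch below is an illustration under simplifying assumptions, not the authors' algorithm: it handles only an utterance with an overt subject and a single zero non-subject, the function name `candidate_cfs` and its labels are hypothetical, and "no other continue transition is available" is approximated by checking that ranking by grammatical function alone does not already make the previous Cb the Cp.

```python
from typing import List, Optional, Tuple


def candidate_cfs(subject: str, zero: str, others: List[str],
                  prev_cb: Optional[str]) -> List[Tuple[str, List[str]]]:
    """Candidate Cf rankings for an utterance with an overt subject and a
    zero non-subject. Returns (label, cf) pairs, cf highest ranked first."""
    # Ranking by grammatical function alone: the overt subject is the Cp.
    by_function = [subject, zero] + others
    candidates = [("grammatical function", by_function)]
    # ZTA (simplified): the zero realizes the previous Cb, and ranking by
    # grammatical function does not already make that entity the Cp.
    if prev_cb is not None and zero == prev_cb and by_function[0] != prev_cb:
        candidates.append(("ZTA", [zero, subject] + others))
    return candidates


# Example 32c: Mitiko is the ga-marked subject, Hanako is a zero OBJ2,
# and Hanako was the Cb of 32b.
for label, cf in candidate_cfs("mitiko", "hanako", ["solution"], prev_cb="hanako"):
    print(label, cf)
```

For 32c this yields both the Cf2 ranking (a RETAIN) and the Cf1 ranking that ZTA makes available (a CONTINUE). When the zero did not realize the previous Cb, only the ranking by grammatical function is returned.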

The availability of ZTA means that hanako can be the Cp even when mitiko is realized 
as the subject. This leads to a potential ambiguity in 32d, because it is possible for 
a hearer to simultaneously entertain both of the Cf(32c). In this case the ZTA interpretation 
is preferred (Z = 4.95, p < .001). The less preferred SMOOTH-SHIFT interpretation 
would result from the algorithm's application to Cf2 of 32c. 
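This section does not say how the Z values were computed. One reading that reproduces the value reported for 32d from the counts given there (28 and 6) is a one-sample z test of the observed proportion against chance (0.5), with the sample proportion used in the standard error. The sketch below assumes that reading; it is not a statement of the authors' procedure, and the function name `preference_z` is hypothetical.

```python
import math


def preference_z(preferred: int, other: int) -> float:
    """z statistic for the observed proportion of the preferred reading
    against chance (0.5), using the sample proportion in the standard error.

    Undefined (division by zero) when either count is zero.
    """
    n = preferred + other
    p = preferred / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return (p - 0.5) / standard_error


# Example 32d: 28 subjects chose the ZTA reading, 6 the SMOOTH-SHIFT reading.
print(round(preference_z(28, 6), 2))  # 4.95
```

The same computation can be applied to the other pairs of counts reported with the examples in this section.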

ZTA explains the contrast between the discourse segments in example 32 above and 
33 below. The only difference between 32 and 33 is that in 32c, mitiko is a ga-marked 
subject, whereas in 33c, mitiko is a wa-marked subject/grammatical topic. Utterances 
32c and 33c have the same meaning. This minimal pair provides a test to see whether 
ZTA actually characterizes these discourse related effects. 

(33) a. Hanako wa siken o oete, kyoositu ni modorimasita. 

Hanako top/subj exam OBJ finish classroom to returned 
Hanako returned to the classroom, finishing her exams. 

Cb: hanako 

Cf: [hanako, exam] 

b. hon o locker ni simaimasita. 
SUBJ book OBJ locker in took-away 
(Hanako) put (her) books in the locker. 

Cb: hanako 

Cf: [hanako, book] continue 

c. Itumo no yooni Mitiko wa mondai no tokikata o setumeisidasimasita. 
always like top/subj Mitiko OBJ2 problem solve-way OBJ explained 

Mitiko, as usual, started explaining (to Hanako) how to solve the problems. 

Cb: HANAKO 

Cf1: [hanako, mitiko, solution] ZTA CONTINUE 
TOP, SUBJ, OBJ 

Cf2: [mitiko, hanako, solution] RETAIN 
TOP, OBJ2, OBJ 

d. ohiru ni sasoimasita. 
SUBJ OBJ lunch to invited 
(Hanako) invited (Mitiko) to lunch. 
(Mitiko) invited (Hanako) to lunch. 



26 See section ti for an example of how a smooth-shift interpretation is calculated. 

Cb1: HANAKO 
Cf1: [hanako, lunch, mitiko] CONTINUE from Cf1(c) 18 
SUBJ, OBJ2, OBJ 

Cb2: MITIKO 
Cf2: [mitiko, lunch, hanako] SMOOTH-SHIFT from Cf2(c) 16 
SUBJ, OBJ2, OBJ 

The wa marking has the predicted effect. Using the grammatical topic marker wa in 
33c dampens ZTA and thus affects the interpretation of 33d, which is now completely 
ambiguous (Z = 0.34, not significantly different from chance). Because the discourse 
entity realized as the grammatical topic and indicated by the wa-marked np is the Cp by 
default, 10 subjects who previously did can no longer get an interpretation that depends 
on ZTA. The situation can be characterized as a case of competing defaults, so that in 
33, some hearers apply the default that the wa-marked entity is the Cp, and others apply 
the default that continue interpretations are preferred and that zeros realize discourse 
entities that are ranked highly on the Cf. 

The RETAIN interpretation in 33c, Cf2, indicates that these hearers expect the conversation 
to shift to being about Mitiko; the fact that Mitiko is the Cp(33c), along with 
constraint 3 will force a shift. Given a shift, the Mitiko invited Hanako to lunch interpretation 
is preferred because it is the more highly ranked SMOOTH-SHIFT transition. 

These examples clearly show that the wa-marked np is not always the Cp and support 
Shibatani's claim that the interpretation of wa depends on the discourse context 
(Shibatani, 1990). The astute reader will have noticed that in the cases where Hanako is 
a zero topic, the wa-marked Mitiko discourse entity is ranked according to grammatical 
function. We conjecture that an inference of contrast is supported when the grammatical 
topic is not the Cp. 

The following section discusses the interaction of ZTA with empathy. Then in section 
5.2, we discuss further the ramifications of our distinction between grammatical and zero 
topic. 



5.1 Empathy and Zero Topic Assignment 

This section investigates the interaction of empathy and zero topic assignment 
(ZTA). The discourse segment in 34 is a minimal pair with that in 35. In 34d the 
main verb is setumeisita ('explain') without any empathy marking, whereas in 35d, the 
same sentence occurs with an auxiliary empathy verb as setumeisite-kureta. Remember 
that kureta marks the OBJ or OBJ2 as the empathy locus. 



(34) a. Taroo wa deeta o konpyuutaa ni utikondeimasita. 

Taroo TOP/SUBJ data OBJ computer in was-storing 
Taroo was storing the data in a computer. 

Cb: TAROO 

Cf: [taroo, data] 

b. yatto hanbun yari-owarimasita. 
SUBJ finally half do-finished 
Finally (Taroo) was half finished. 



27 If MITIKO could represent a topic object in 33d, there would be another equally ranked 
SMOOTH-SHIFT interpretation for 33d. However, according to the formulation of ZERO TOPIC 
ASSIGNMENT, MITIKO cannot be a zero topic because it was not the Cb of the previous utterance, 
33c. 

Cb: TAROO 

Cf: [taroo] CONTINUE 

c. Ziroo ga hurui deeta o misemasita. 
Ziroo SUBJ OBJ2 old data OBJ showed 
Ziroo showed (Taroo) some old data. 

Cb: TAROO 

Cf1: [taroo, ziroo, data] ZTA CONTINUE 
TOP, SUBJ, OBJ 

Cf2: [ziroo, taroo] RETAIN 
SUBJ, OBJ2, OBJ 

d. ikutuka no kuitigai o setumeisimasita. 
SUBJ OBJ2 several of differences OBJ explained 
(Ziroo) explained several differences to (Taroo). 
(Taroo) explained several differences to (Ziroo). 

Cb1: TAROO 
Cf1: [taroo, ziroo, differences] CONTINUE from Cf1(c) 12 
SUBJ, OBJ2, OBJ 

Cb2: ZIROO 
Cf2: [ziroo, taroo, differences] SMOOTH-SHIFT from Cf2(c) 22 
SUBJ, OBJ2, OBJ 
The interpretations of 34d show that it is possible for some subjects to interpret Taroo 
as the zero topic in 34c. This is possible because Taroo was both the Cp and the Cb for 
34a and 34b. The two Cfs of 34c reflect multiple possibilities in attentional state. The 
competing defaults consist of the assumption that ZTA applies, versus the assumption 
that subjects are more highly ranked than objects on the Cf. In this case no preference 
between the two interpretations can be demonstrated (Z = 1.79, not significant). 

Example 35 is a minimal pair with 34. In 35d, the speaker provides more syntactic 
information by using the empathy verb kureta to indicate that the discourse entity 
realized as the object2 is the empathy locus. 

(35) a. Taroo wa deeta o konpyuutaa ni utikondeimasita. 

Taroo TOP/SUBJ data OBJ computer in was-storing 
Taroo was storing the data in a computer. 

Cb: TAROO 

Cf: [taroo, data] 

b. yatto hanbun yari-owarimasita. 
SUBJ finally half do-finished 
Finally (Taroo) was half finished. 

Cb: TAROO 

Cf: [taroo] CONTINUE 

c. Ziroo ga hurui deeta o misemasita. 
Ziroo SUBJ OBJ2 old data OBJ showed 
Ziroo showed (Taroo) some old data. 



28 Although both possibilities have the same semantic interpretation. 

Cb: TAROO 

Cf1: [taroo, ziroo, data] ZTA CONTINUE 
TOP, SUBJ, OBJ 

Cf2: [ziroo, taroo, data] RETAIN 
SUBJ, OBJ2, OBJ 

d. ikutuka no kuitigai o setumeisite-KURE-masita. 

SUBJ OBJ2/EMP several of differences OBJ explained-gave 
(Ziroo) did (Taroo) a favor of explaining several differences. 

Cb1: TAROO 

Cf1: [taroo, ziroo, differences] CONTINUE from Cf1(c) 33 
EMP-OBJ2, SUBJ, OBJ 

Cb2: ZIROO 

Cf2: [ziroo, taroo, differences] SMOOTH-SHIFT from Cf2(c) 1 
EMP-OBJ2, SUBJ, OBJ 

Empathy associates with the previous Cb to yield a continue transition, and the 
interpretation changes so that the utterance is no longer ambiguous (Z = 16.24, p < .001). 
In this case it is possible to interpret both 35c and 35d as continues by assuming ZTA 
at 35c. This example also validates ZTA because empathy associates with the zero topic 
(Kuno, 1976b; Kuno, 1987). Furthermore, this minimal pair highlights aspects of the 
interaction between syntax and inference. The fact that the empathy verb in 35d is the 
only difference between 34 and 35 shows that the preference in interpretation does not 
follow from inferences based on information about who is likely to explain what to whom, 
depending on who showed who the data, or whether the data is new or old. 

Example 36 contrasts minimally with example 35 but on another dimension. In this 
case 36c is a continue with Taroo realized in subject position, rather than a continue 
based on ZTA. The Ziroo explained to Taroo interpretation is again clearly preferred here 
as in 35d (Z = 3.638, p < .001). 



(36) a. Taroo wa deeta o konpyuutaa ni utikondeimasita. 

Taroo TOP/SUBJ data OBJ computer in was-storing 
Taroo was storing the data in a computer. 

Cb: [taroo] 

Cf: [taroo, data] 

b. yatto hanbun yari-owarimasita. 
SUBJ finally half do-finished 
Finally (Taroo) was half finished. 

Cb: TAROO 

Cf: [taroo] CONTINUE 

c. Ziroo ni hurui deeta o misemasita. 
SUBJ Ziroo OBJ2 old data OBJ showed 
(Taroo) showed Ziroo some old data. 

Cb: TAROO 

Cf1: [taroo, ziroo, data] CONTINUE 
SUBJ, OBJ2, OBJ 

d. ikutuka no kuitigai o setumeisite-KURE-masita. 

SUBJ OBJ2/EMP several of differences OBJ explained-gave 
(Ziroo) did (Taroo) a favor of explaining several differences. 

Cb1: TAROO 

Cf1: [taroo, ziroo, differences] CONTINUE 26 
EMP-OBJ2, SUBJ, OBJ 

Cf2: [ziroo, taroo, differences] RETAIN 8 
EMP-OBJ2, SUBJ, OBJ 

In 36 as in 35, EMPATHY associates with the previous Cb, i.e. Taroo. We claim that 
this follows from the ordering of the Cf and hearers' preferences for a CONTINUE 
interpretation. 

Note that the interpretation of the last utterance in 36d remains the same as that in 
35d, although in this case it is Taroo that shows Ziroo some old data in 36c; nevertheless 
Ziroo is the one who does the explaining. It seems that inference from world knowledge 
and domain information alone is unlikely to predict which interpretations hearers will 
prefer. Inferential processes and discourse structure are mutually constraining (Joshi and 
Weinstein, 1981; Nadathur and Joshi, 1983; Hudson-D'Zmura, 1 



5.2 Summary 

We proposed a discourse rule of ZERO TOPIC ASSIGNMENT and showed that ZTA is 
conditioned by the rules and constraints of centering theory: (1) ZTA only applies to 
discourse entities that were previously the Cb; (2) ZTA is constrained to cases where the 
only transition available otherwise would be a retain. 

ZTA arises from the interaction between preferences for continue transitions (Rule 
2) and the fact that Cbs are often zeros (Rule 1). The interaction of these two factors 
leads to the speculation that when the Cb is realized by a pronoun in a lower ranked 
Cf position, which gives rise to a retain transition state, this type of transition 
is inherently ambiguous. Since different factors contribute to the salience of discourse 
entities, such as 'subjecthood' and 'pronominalization' (Grosz, Joshi, and Weinstein, 
1986), conflicting defaults can arise when these are in conflict with one another. This 
may be especially true in Japanese since another factor that should contribute to Cf 
ranking, word order, is not present whenever zeros are involved. 

These examples highlight the relation between centering and global coherence in 
discourse. A retain is proposed as a way for a speaker to mark a coordinated transition 
to a new topic; it predicts a shift (Grosz, Joshi, and Weinstein, 1986; Brennan, Friedman, 
and Pollard, 1987). However, the way in which centering shift transitions are related to 
larger structures in discourse has not been specified. If a shift functions as a boundary 
between segments (Walker, 1993b), then the hearer's application of ZTA means that the 
hearer is assuming that the next utterance will be part of the same discourse segment. In 
contrast, a hearer's assumption that the current centering transition is a retain means 
that the hearer assumes that the next utterance will begin a new discourse segment. 

The relationship between segmentation and hearer's preferences for ZTA or retain 
interpretations may be affected by other discourse factors. Among these factors, intonation 
may indicate whether the current utterance should be taken as initiating a new 
segment and predicting a shift, or continuing the previous one (Overman, 1987; Cahn, 
1992; Swerts and Geluykens, 1992; Walker and Prince, In Press). Another factor may 
be the inferred relationship that holds between adjacent utterances such as whether it 
is possible to interpret (d) as Ziroo's reason for having done (c) (Hobbs, 1985b). However, 
this is clearly not the only factor, or even necessarily the dominant one, as we 
have demonstrated. Future research must provide additional constraints on when ZTA 
is applicable. 

6 Related Research 

Other researchers working on the interpretation of anaphors have focused on the role of 
inference from world knowledge (Hobbs, 1985b; Hobbs, 1979). While it is important to 
elucidate the information needed for inference and the type of inferential process involved 
in discourse interpretation, it is clear from our examples that syntactic realization has 
a strong effect on the interpretive process and may provide processing constraints on 
inferential processes. We have focused on the interaction between syntax and inference. 

Our treatment of Japanese discourse phenomena builds on earlier work by Kuno 
(Kuno, 1972; Kuno, 1973; Kuno, 198^; Kuno, 1989). Our Cf ranking is consistent with 
Kuno's Empathy and Topic Hierarchies and we incorporate a number of Kuno's observations 
on the function of the grammatical topic marker wa and the role of zeros. We 
have also incorporated Kuno's notion of empathy by using empathy in the Cf ranking 
(Kuno, 1976a; Kuno and Kaburaki, 1977). 



In recent work, Kuno proposes an algorithmic account of the interpretation of zeros. 
He claims that there are two types of zero pronouns, PSEUDO-ZERO-PRONOUNS and 
REAL-ZERO-PRONOUNS (Kuno, 198^). REAL-ZERO-PRONOUNS are supposed to have a 
wa-marked np or a presentational np as an antecedent (Yoshimoto). PSEUDO-ZERO-PRONOUNS 
are actually examples of deletion, and must follow the same order and the 
same syntactic function as their source nps. They must obey constraints on deletion such 
as Kuno's Pecking Order of Deletion Principle: Delete less important information first 
and more important information last. According to Kuno, the position just to the left 
of the verb is the default focus position in Japanese, unless the verb itself is the focus. 
Therefore, since the verb in 37b is the information focus, the zeros are assumed to be 
PSEUDO-ZERO-PRONOUNS. 

(37) a. Taroo ga Hanako ni nani o sita no desu ka. 
Taroo SUBJ Hanako to what OBJ do COMP COPULA Q 
What did Taroo do to Hanako? 

b. kisu o sita no desu. 

kiss OBJ did COMP COPULA 
(lit.) (Taroo) did a kissing (to Hanako). 

The combination of these two types of zeros can explain examples like the following: 

(38) a. Taroo wa Hanako ga sukida. 

Taroo top/subj Hanako fond-of-is 
Taroo likes Hanako. 

b. Ziroo wa Natuko ga sukida. 
Ziroo top/subj Natuko fond-of-is 
Ziroo likes Natuko. 

c. Saburoo mo sukida. 

Saburoo also fond-of-is 
(Ziroo) also likes Saburoo. 
*Saburoo also likes (Natuko). 

Kuno's account treats Ziroo in 38c as a real-zero-pronoun. In this case we would 
predict the preferred interpretation based on our distinction between continue and 
RETAIN. However, consider the following example: 

(39) a. Taroo wa Hanako ga sukida. 

Taroo top/subj Hanako fond-of-is 
Taroo likes Hanako. 

b. Ziroo wa kirai da. 

Ziroo top/subj dislike-is 
(Taroo) dislikes Ziroo. 
Ziroo dislikes (Hanako). 
* Ziroo dislikes (Taroo). 

The Taroo dislikes Ziroo interpretation would be an example of ZTA. However, we 
would predict that the Ziroo dislikes Hanako interpretation would be dispreferred, but 
this does not seem to be the case. Kuno's analysis treats the zero in the second reading 
of 39b as a pseudo-zero-pronoun which means that it must be interpreted as Hanako 
since Hanako was the object of the previous utterance. 

The interpretation of 39b that we would predict as possible would be the Ziroo dislikes 
Taroo (retain) which native speakers rarely get. However, Kuno's analysis does not 
block this reading either; the zero in 39b could also be a real-zero-pronoun, with 
Taroo as its antecedent. Kuno says that this interpretation is dispreferred because of a 
preference for parallel interpretation (Grober, Beardsley, and Caramazza, 1978; Sidner, 
1979; Kameyama, 1988; Kuno, 1989). We have claimed here and elsewhere (Brennan, 
Friedman, and Pollard, 1987; Walker, Iida, and Cote, 1990) that the preference for parallelism 
is an epiphenomenon of the ordering of the Cf and the preference for continue 
interpretations. 

Our account cannot explain the contrast between 38 and 39. It seems that what is 
at issue here is the fact that a set of discourse entities plus an open proposition such 
as X likes Y is what is discourse-old in these examples and not just a discourse entity 
(Prince, 1981a; Prince, 1986; Prince, 1992). Our conclusion is that these enumerated lists 
and question-answer discourse segments may need an account of discourse center that 
is broader than that needed for discourse entities realized as nps. Kuno's constraints 
on deletion must also be integrated to fully explain when entities or propositions in the 
discourse may be unexpressed. 



Our analysis also builds on an earlier analysis put forth by Kameyama (Kameyama, 
1985; Kameyama, 1986; Kameyama, 1988). Although Kameyama uses the centering terminology, 
her account is not based on the constraints and rules of Centering Theory as 
developed here and presented in (Grosz, Joshi, and Weinstein, 1983; Grosz, Joshi, and 
Weinstein, 1986; Brennan, Friedman, and Pollard, 1987). Kameyama proposed that the 
interpretation of zeros in Japanese depends on a default preference hierarchy of syntactic 
properties to be shared between the antecedent and the zero (Grober, Beardsley, 
and Caramazza, 1978). Kameyama's account of zero interpretation consists roughly of 
a PROPERTY-SHARING CONSTRAINT, henceforth PS, and an EXPECTED CENTER ORDER, 
henceforth ECO, which may be paraphrased as follows: 

Property-Sharing Constraint: Two zero-pronouns in adjacent utterances, 
which co-specify the same Cb-encoding discourse entity, should 
share one of the following properties (in descending order of preference): 
1) both IDENT and SUBJECT, 2) IDENT alone, 3) SUBJECT alone, 4) both 
NONIDENT and NONSUBJECT, 5) NONSUBJECT alone, or 6) NONIDENT 
alone. 

Expected Center Order Rule: In a sentence that contains a center-establishing 
zero, if it is to have a full np as its antecedent, the default 
preference order among its potential antecedent nps is: Topic > Ident > 
Subject > Object(2) > Others. 

As noted earlier, we use a modified version of Kameyama's Expected Center 
Order as the ordering of the Cf, but Kameyama's treatment differs from ours in a 
number of respects. 
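As a rough illustration of how an ordering such as the ECO ranks candidate antecedents, the following Python sketch sorts entities by the preference order paraphrased above. It is a sketch under stated assumptions: the lowercase role labels and the function name `order_antecedents` are hypothetical, Object(2) is collapsed into a single "object" role, and each entity is assumed to carry only its highest-ranked role.

```python
from typing import List, Tuple

# Kameyama's Expected Center Order as paraphrased above, highest preference
# first. "object" stands for Object(2); the role labels are hypothetical.
ECO = ["topic", "ident", "subject", "object", "other"]


def order_antecedents(nps: List[Tuple[str, str]]) -> List[str]:
    """Sort (entity, role) pairs by the ECO; ties keep their input order."""
    rank = {role: position for position, role in enumerate(ECO)}
    return [entity for entity, role in sorted(nps, key=lambda np: rank[np[1]])]


# Example 41a: Hanako is the wa-marked topic, Taroo is OBJ2.
print(order_antecedents([("taroo", "object"), ("hanako", "topic")]))
# ['hanako', 'taroo']
```

A table lookup of this kind keeps the ranking itself as data, so a modified ordering only requires changing the list.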

First, Kameyama used the property ident to describe something similar to Kuno's 
notion of empathy, and has an added assumption of a subject ident default, i.e. 
subjects are considered to be empathy loci by default. This means that her theory also 



includes a neutralization device for cases where this default is not in effect (Kameyama, 
198^). In contrast, our theory explains examples covered by the subject ident default 
by including EMPATHY in the ranking of the Cf list and by the distinction between 
continue and RETAIN as illustrated in example |4[ 

We have also expanded Kameyama's treatment of topic. We have elucidated the 
interaction of topic with subject and empathy markers and supported our claim that 
the topic marker wa functions similarly to pronominalization in instantiating the Cb. In 
addition, ZTA and the distinction that we make between grammatical and zero topic is 
new to our account. 

Furthermore, since Center Instantiation is a side effect of the application of the 
Centering algorithm, we treat 40c and 41c with the same mechanism. In Kameyama's 
analysis, the PS constraint applies to 40, while the ECO applies in 41. 

(40) a. Hanako wa repooto o kakimasita. 

Hanako top/subj report OBJ wrote 
Hanako wrote a report. 

b. Taroo ni aini-ikimasita. 
subj-ident Taroo obj2 see-went 

She went to see Taroo. 

c. Taroo wa kibisiku hihansimasita. 
Taroo top/subj OBJ severely criticized 
Taroo severely criticized her. 



(41) a. Hanako wa Taroo ni aini-kimasita. 

Hanako top/subj Taroo OBj2 see-came 
Hanako came to see Taroo. 

b. Taroo wa hon o yonde-kure-masita. 

Taroo top/subj OBj2 book OBJ read-gave 
Taroo did her a favor of reading a book. 

Note that we annotate 40b with Kameyama's ident property, which corresponds to 
EMPATHY. Kameyama's account predicts that there are different processes going on in 
the resolution of zeros depending on the environments where the zero appears. PS applies 
in 40c because the previous utterance has a zero, but doesn't apply in 41b. PS would 
seem to predict that the zero pronoun in 40c should not be interpreted as Hanako, since 
the zero carries the properties [Subj, Ident] in 40b and [NonSubj, NonIdent] in 40c. 
In other words, none of the required properties of SUBJ, ident, nonsubj, nonident, 
which 'should' be shared according to the PS constraint, are shared. But in fact 40c is 
perfectly acceptable under the intended reading of Taroo severely criticized Hanako and 
41b is likewise acceptable under the reading Taroo did Hanako a favor of reading a book. 
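The property comparison in the argument about 40b and 40c can be made concrete with a small sketch. It encodes one reading of the PS constraint as paraphrased earlier, in which a property counts as shared only when both zeros carry the same value for it; the function name `property_sharing_rank` and the boolean encoding are hypothetical, and the treatment of cases such as "IDENT alone" is an assumption about how the paraphrase should be read.

```python
from typing import Optional


def property_sharing_rank(ident_a: bool, subj_a: bool,
                          ident_b: bool, subj_b: bool) -> Optional[int]:
    """Preference rank (1 = most preferred) of the properties shared by two
    zeros, or None if none of the listed properties is shared."""
    share_ident = ident_a and ident_b
    share_subj = subj_a and subj_b
    share_nonident = not ident_a and not ident_b
    share_nonsubj = not subj_a and not subj_b
    if share_ident and share_subj:
        return 1  # both IDENT and SUBJECT
    if share_ident:
        return 2  # IDENT alone
    if share_subj:
        return 3  # SUBJECT alone
    if share_nonident and share_nonsubj:
        return 4  # both NONIDENT and NONSUBJECT
    if share_nonsubj:
        return 5  # NONSUBJECT alone
    if share_nonident:
        return 6  # NONIDENT alone
    return None


# 40b: the zero is [Subj, Ident]; 40c: the zero is [NonSubj, NonIdent].
print(property_sharing_rank(True, True, False, False))  # None
```

On this reading the two zeros that refer to Hanako in 40b and 40c share none of the listed properties, which is the situation the text describes.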

Also, as pointed out in (Kuno, 1989), Kameyama's theory makes no predictions 
about the interpretation of some of the zeros in examples such as |^, repeated here for 
convenience as 42. 



(42) a. Taroo wa saisin no konpyuutaa o kaimasita. 

TOP/SUBJ newest of computer OBJ bought 
Taroo bought a new model of computer. 

Cb: TAROO 

Cf: [taroo, computer] 

b. John ni sassoku sore o misemasita. 
SUBJ John OBJ2 at once that OBJ showed 
(Taroo) showed it to John. 

Cb: TAROO 

Cf: [taroo, JOHN, computer] CONTINUE 

c. atarasiku sonawatta kinoo o setumeisimasita. 
SUBJ OBJ2 newly equipped function OBJ explained 
(Taroo) explained the newly equipped functions to (John). 

Cb: TAROO 

Cf1: [taroo, JOHN] CONTINUE 27 
SUBJ OBJ 

Cf2: [JOHN, taroo] RETAIN 1 
SUBJ OBJ 

The PS Constraint applies only to two zeros in adjacent sentences, and the ECO 
applies only when a Cb is to be established. 42c is not a Cb-establishing utterance since 
the Cb has already been established in 42b, so the ECO should not apply. The PS 
constraint does apply and predicts that the subject zero must have the subject of 42b 
as its antecedent, but the theory makes no predictions about the possible interpretations 
for the zero object. 

Many of the examples that are explained in Kameyama's theory by the PS constraint 
are handled on our account by the distinction between continue and retain. However, 
there are a number of cases where PS makes different predictions than our account. 
In particular note that for examples p2| and 35, Kameyama's subject ident default 



makes exactly the opposite prediction. Example 35 is repeated below as 43 and annotated with 
the SUBJECT IDENT default feature. 

(43) a. Taroo wa deeta o konpyuutaa ni utikondeimasita. 

Taroo TOP/SUBJ data OBJ computer in was-storing 
Taroo was storing the data in a computer. 

b. yatto hanbun yari-owarimasita. 
SUBJ/ident finally half do-finished 
Finally he was half finished. 

c. Ziroo ga hurui deeta o misemasita. 
Ziroo subj/ident obj2 old data obj showed 
Ziroo showed him some old data. 

d. ikutuka no kuitigai o setumeisite-KURE-masita. 

SUBJ OBJ2/ident several of differences OBJ explained-gave 
(Ziroo) did (Taroo) a favor of explaining several differences. 

According to PS, the interpretation in which the property ident is shared is preferred 
to the one with SUBJECT shared, and hence, the interpretation Taroo did Ziroo a favor in 
explaining several differences is preferred. However our survey shows that native speakers 
prefer the Ziroo did Taroo a favor reading; this is explained by our discourse rule of ZTA 
and by including empathy in the ranking of the Cf list. 

7 Conclusion and Future Work 

In this paper, we have attempted to elucidate the interaction of syntactic realization
and discourse salience in Japanese using the discourse processing framework of centering.
In our theory discourse salience is operationalized by the ranking of the forward
centers for an utterance. We explored speakers' options for indicating salience in Japanese
discourse, especially the interaction of discourse markers for TOPIC and empathy. We
then posited a ranking and used it to explain some facts about the interpretation of zeros
in Japanese.
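
As a rough illustration of what it means to operationalize salience as a ranking of forward centers, the following minimal sketch orders the entities of one utterance by the most salient role each carries. The rank order used here (topic, then empathy, then subject, object2, object, other) and the names `RANK_ORDER` and `rank_cf` are assumptions made for the sketch, as is the encoding of utterance (43d); none of it should be read as the definitive formulation of the theory.

```python
# Assumed rank order, most salient first. This is an illustrative parameter
# value chosen for the sketch, not a definition quoted from this section.
RANK_ORDER = ["TOPIC", "EMPATHY", "SUBJECT", "OBJECT2", "OBJECT", "OTHER"]


def rank_cf(entities):
    """Return entity names ordered from most to least salient.

    `entities` maps each entity realized in an utterance to the set of
    roles it carries, e.g. {"Taroo": {"TOPIC", "SUBJECT"}}.
    An entity is ranked by its single most salient role; ties keep the
    order in which the entities were listed (sorted() is stable).
    """
    def best_role(name):
        return min(RANK_ORDER.index(role) for role in entities[name])

    return sorted(entities, key=best_role)


# Hypothetical encoding of utterance (43d) under the reading in which
# Ziroo does Taroo a favor; placing empathy on the OBJECT2 is an assumption.
utterance_43d = {
    "Ziroo": {"SUBJECT"},
    "Taroo": {"OBJECT2", "EMPATHY"},
    "kuitigai": {"OBJECT"},
}
print(rank_cf(utterance_43d))
```

Under these assumptions the entity carrying empathy is ranked above the subject, which is the kind of effect the ranking is meant to capture.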

While there is clearly a correlation between syntax and discourse function, we show
that discourse context plays an important role. We proposed a discourse rule of ZERO
TOPIC ASSIGNMENT (ZTA) which distinguishes grammatical and zero topic. We showed
that centering allows us to formalize constraints on when ZTA can apply. However, future
work must determine additional constraints on when ZTA applies, and which language
families support ZTA.

The preferred interpretation of zeros and the discourse factors which are responsible 
for each interpretation are summarized below. Remember that in each case the zero in 
the third utterance was established as the Cb by the previous two utterances: 



Third Utterance          Fourth Utterance               Discourse Factor            Example
SUBJECT     OBJECT(2)    SUBJECT   OBJECT(2)
-------------------------------------------------------------------------------------------
zero(i)     NP(j)        zero(i)   zero(j)              Continue/Retain             1
zero(i)     NP(j)        zero(j)   zero(i), empathy     empathy, Continue/Retain
NP(ga)(i)   zero(j)      zero(j)   zero(i)              ZTA
NP(wa)(i)   zero(j)      zero(i)   zero(j)              WA-effect
                         zero(j)   zero(i)              ZTA
NP(ga)(i)   zero(j)      zero(i)   zero(j), empathy     ZTA and empathy





This analysis suggests that centering may be a universal of context-dependent
processing of language, although so far this theory has only been applied to English, German,
Japanese, Italian and Turkish (Brennan, Friedman, and Pollard, 1987; Walker, 1989;
Walker, Iida, and Cote, 1990; Di Eugenio, 1990; Cote, 1992; Rambow, 1993; Nakatani,
1993; Hoffman, In Press; Turan, 1995). We proposed that the centering component of a
theory of discourse interpretation can be constructed in a language independent fashion,
up to the declaration of a language-specific value for one parameter of the theory, i.e., Cf
ranking (as in section |^). This parameter is language-dependent because different
languages offer different means of expressing discourse function. We conjecture that ZTA
may apply in any free word order language with zeros.
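
To make the division of labor concrete, here is a minimal sketch in which the transition classification is language independent and the only language-specific input is the ranked Cf list handed to it. The transition definitions follow the common formulation associated with Brennan, Friedman, and Pollard (1987) as we recall it; the function name `classify_transition`, the labels SMOOTH-SHIFT and ROUGH-SHIFT, the treatment of a missing previous Cb, and the example Cf lists are all assumptions of the sketch.

```python
def classify_transition(prev_cf, prev_cb, cf):
    """Classify the centering transition into the current utterance.

    prev_cf : ranked Cf list of the previous utterance (most salient first)
    prev_cb : Cb of the previous utterance, or None if none was established
    cf      : ranked Cf list of the current utterance; how it is ranked is
              the language-specific parameter and is decided by the caller
    Returns a pair (cb, label).
    """
    # Cb: the highest-ranked member of the previous Cf list that is
    # realized in the current utterance.
    cb = next((e for e in prev_cf if e in cf), None)
    # Cp: the preferred center, i.e. the top of the current Cf list.
    cp = cf[0] if cf else None
    # Assumption: with no previous Cb, the Cb is treated as unchanged.
    if prev_cb is None or cb == prev_cb:
        label = "CONTINUE" if cb == cp else "RETAIN"
    else:
        label = "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"
    return cb, label


# Hypothetical Cf lists for one utterance under two different rankings.
prev_cf = ["Taroo", "Ziroo", "deeta"]
with_empathy = ["Taroo", "Ziroo", "kuitigai"]      # empathy outranks subject
without_empathy = ["Ziroo", "Taroo", "kuitigai"]   # subject ranked first

print(classify_transition(prev_cf, "Taroo", with_empathy))
print(classify_transition(prev_cf, "Taroo", without_empathy))
```

The same utterance is classified as a CONTINUE under the first ranking and a RETAIN under the second, which shows why the choice of Cf ranking is the one place where language-specific facts enter.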

Future work must examine the interaction between centering and discourse segmentation
in both monologue and dialogue (Whittaker and Stenton, 1988; Walker and Whittaker,
1990; Walker, 1993b), and the role of deictics, lexical semantics, one anaphora,
and propositional discourse entities in centering (Webber, 1978; Sidner, 1979; Walker,
199|^; Walker, 1993a; Cote, 199^). It is also important to examine the interaction of zeros
with overt pronouns and with deictics, and the interaction of pronominalization with
accenting (Terken, 1995). In addition, the semantic theory underlying centering must be
further developed (Roberts, 1995). Finally, centering transitions are currently defined by
an equality relation between discourse entities, but poset relations and functional
dependencies often link entities in discourse (Prince, 1978b; Prince, 1981a; Ward, 1985; Grosz,
Joshi, and Weinstein, 1986). The predictions made here should also be tested on a large
corpus of naturally occurring Japanese discourse (Hurewitz and Linson, In Press).

9 Appendix: Instructions to Survey Participants



9.1 Instructions for Survey 1 and 2 

What interpretation do you get for the THIRD sentence of each set where there are
two unexpressed arguments? 0(i) in the second sentence indicates that the unexpressed
argument in the sentence should be interpreted as referring to the NP of the first sentence
marked with (i). Please rank your preference: it's ok to have more than one equally
preferred interpretation.

9.2 Instructions for Survey 3 

Dear Participants. Thank you for serving as subjects for us for this informal experiment. 
You can help us most by following the directions here. Please read each sample discourse 
in turn and make your interpretation as rapidly as possible. Do not scroll back and forth 
in the file. Please indicate which interpretation, (a) or (b) you get by marking your choice 
with a 1. It is very important that you choose *one* interpretation only, and the one you 
choose should be the first one that you think of as you are reading the sample discourse. 
Send us back this file with your choices marked.

References

[Brennan] Brennan, Susan E. Centering attention in discourse. Submitted.

[Brennan, Friedman, and Pollard 1987] Brennan, Susan E., Marilyn Walker Friedman, and
Carl J. Pollard. 1987. A centering approach to pronouns. In Proc. 25th Annual Meeting of
the ACL, Stanford, pages 155-162.

[Cahn 1992] Cahn, Janet. 1992. An investigation into the correlation of cue phrases, unfilled
pauses and the structuring of spoken discourse. In Workshop on Prosody in Natural Speech,
pages 19-31. Institute for Research in Cognitive Science, University of Pennsylvania, TR
IRCS-92-37.

[Carter 1987] Carter, David M. 1987. Interpreting Anaphors in Natural Language Texts. Ellis
Horwood, Chichester, England.

[Clancy and Downing 1987] Clancy, Patricia M. and Pamela Downing. 1987. The use of wa as a
cohesion marker in Japanese oral narratives. In J. Hinds et al., editor, Perspectives on
Topicalization: The case of Japanese wa. Academic Press, pages 3-56.

[Clark and Haviland 1977] Clark, Herbert H. and Susan E. Haviland. 1977. Comprehension and
the given-new contract. In R. O. Freedle, editor, Discourse Production and Comprehension.
Ablex Publishing, Norwood, N.J., pages 1-40.

[Clark and Marshall 1981] Clark, Herbert H. and Catherine R. Marshall. 1981. Definite
reference and mutual knowledge. In Joshi, Webber, and Sag, editors, Elements of Discourse
Understanding. CUP, Cambridge, pages 10-63.

[Cote 1992] Cote, Sharon. 1992. Discourse functions of two types of null objects in English. In
Linguistic Society of America Annual Meeting, page 12.

[Cote 1995] Cote, Sharon. 1995. Ranking forward-looking centers. In Marilyn A. Walker,
Aravind K. Joshi, and Ellen F. Prince, editors, Centering in Discourse. Oxford University
Press.

[Di Eugenio 1990] Di Eugenio, Barbara. 1990. Centering theory and the Italian pronominal
system. In COLING90: Proc. 13th International Conference on Computational Linguistics,
Helsinki, pages 270-275.

[Gordon, Grosz, and Gilliom 1993] Gordon, Peter C., Barbara J. Grosz, and Laura A. Gilliom.
1993. Pronouns, names and the centering of attention in discourse.

[Grober, Beardsley, and Caramazza 1978] Grober, Ellen H., William Beardsley, and Alfonso
Caramazza. 1978. Parallel function strategy in pronoun assignment. Cognition, 6:117-133.

[Grosz 1977] Grosz, Barbara J. 1977. The representation and use of focus in dialogue
understanding. Technical Report 151, SRI International, 333 Ravenswood Ave, Menlo Park,
Ca. 94025.

[Grosz, Joshi, and Weinstein 1983] Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein.
1983. Providing a unified account of definite noun phrases in discourse. In Proc. 21st
Annual Meeting of the ACL, Association for Computational Linguistics, pages 44-50.

[Grosz, Joshi, and Weinstein 1986] Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein.
1986. Towards a computational theory of discourse interpretation. Unpublished Manuscript.

[Grosz and Sidner 1985] Grosz, Barbara J. and Candace L. Sidner. 1985. The structure of
discourse structure. Technical Report CSLI-85-39, Center for the Study of Language and
Information, Stanford, CA.

[Grosz and Sidner 1986] Grosz, Barbara J. and Candace L. Sidner. 1986. Attentions, intentions
and the structure of discourse. Computational Linguistics, 12:175-204.

[Hajicova and Vrbova 1982] Hajicova, Eva and Jarka Vrbova. 1982. On the role of the hierarchy
of activation in the process of natural language understanding. In COLING82: Proc. 9th
International Conference on Computational Linguistics, Prague, pages 107-113.

[Hasegawa 1984] Hasegawa, Nobuko. 1984. On the so-called zero pronouns in Japanese. The
Linguistic Review, pages 289-341.

[Hobbs 1976a] Hobbs, Jerry R. 1976a. A computational approach to discourse analysis.
Technical Report 76-2, Department of Computer Science, City College, City University of
New York.

[Hobbs 1976b] Hobbs, Jerry R. 1976b. Pronoun resolution. Technical Report 76-1, Department
of Computer Science, City College, City University of New York.






[Hobbs 1979] Hobbs, Jerry R. 1979. Coherence and coreference. Cognitive Science, 3:67-90.

[Hobbs 1985a] Hobbs, Jerry R. 1985a. The logical notation: Ontological promiscuity, chapter 2
of discourse and inference. Technical report, SRI International, 333 Ravenswood Ave.,
Menlo Park, Ca 94025.

[Hobbs 1985b] Hobbs, Jerry R. 1985b. On the coherence and structure of discourse. Technical
Report CSLI-85-37, Center for the Study of Language and Information, Ventura Hall,
Stanford University, Stanford, CA 94305.

[Hobbs et al. 1987] Hobbs, Jerry R., William Croft, Todd Davies, Douglas Edwards, and
Kenneth Laws. 1987. The TACITUS commonsense knowledge base. Technical report, SRI
International, 333 Ravenswood Ave., Menlo Park, Ca 94025.

[Hobbs and Martin 1987] Hobbs, Jerry R. and Paul Martin. 1987. Local pragmatics. Technical
report, SRI International, 333 Ravenswood Ave., Menlo Park, Ca 94025.

[Hoffman In Press] Hoffman, Beryl. In Press. Word order, information structure and centering
in Turkish. In Marilyn A. Walker, Aravind K. Joshi, and Ellen F. Prince, editors, Centering
in Discourse. Oxford University Press.

[Horn 1986] Horn, Laurence R. 1986. Presupposition, theme and variations. In Chicago
Linguistic Society, 22, pages 168-192.

[Hudson-D'Zmura and Tanenhaus in press] Hudson-D'Zmura, Susan and Michael K. Tanenhaus.
in press. Assigning antecedents to ambiguous pronouns: The role of the center of attention
as the default assignment. In Marilyn A. Walker, Aravind K. Joshi, and Ellen F. Prince,
editors, Centering in Discourse. Oxford University Press.

[Hudson-D'Zmura 1988] Hudson-D'Zmura, Susan B. 1988. The Structure of Discourse and
Anaphor Resolution: The discourse Center and the Roles of Nouns and Pronouns. Ph.D.
thesis, University of Rochester.

[Hurewitz and Linson In Press] Hurewitz, Felicia and Brian Linson. In Press. A quantitative
look at discourse coherence. In Marilyn A. Walker, Aravind K. Joshi, and Ellen F. Prince,
editors, Centering in Discourse. Oxford University Press.

[Iida 1992] Iida, Masayo. 1992. Context and Binding in Japanese. Ph.D. thesis, Stanford
University, Linguistics Department.

[Joshi 1982] Joshi, Aravind K. 1982. Mutual beliefs in question-answer systems. In Neil V.
Smith, editor, Mutual Knowledge. Academic Press, New York, New York, pages 181-199.

[Joshi and Kuhn 1979] Joshi, Aravind K. and Steve Kuhn. 1979. Centered logic: The role of
entity centered sentence representation in natural language inferencing. In Proc.
International Joint Conference on Artificial Intelligence.

[Joshi and Weinstein 1981] Joshi, Aravind K. and Scott Weinstein. 1981. Control of inference:
Role of some aspects of discourse structure - centering. In Proc. International Joint
Conference on Artificial Intelligence, pages 385-387.

[Kameyama 1985] Kameyama, Megumi. 1985. Zero anaphora: the case of Japanese. Ph.D.
thesis, Stanford University.

[Kameyama 1986] Kameyama, Megumi. 1986. A property-sharing constraint in centering. In
Proc. 24th Annual Meeting of the ACL, Association for Computational Linguistics, pages
200-206, New York, NY.

[Kameyama 1988] Kameyama, Megumi. 1988. Japanese zero pronominal binding, where syntax
and discourse meet. In William Poser, editor, Papers from the Second International
Workshop on Japanese Syntax. Stanford: CSLI, pages 47-74. Also available as University of
Pennsylvania Tech Report MS-CIS-86-60.

[Kuno 1973] Kuno, Susumu. 1973. The Structure of the Japanese Language. The MIT Press,
Cambridge, Massachusetts.

[Kuno 1972] Kuno, Susumu. 1972. Pronominalization, reflexivization, and direct discourse.
Linguistic Inquiry, 3:161-195.

[Kuno 1976a] Kuno, Susumu. 1976a. The speaker's empathy and its effects of syntax: A
re-examination of yaru and kureru in Japanese. Journal of the Association of Teachers of
Japanese, 11:249-271.

[Kuno 1976b] Kuno, Susumu. 1976b. Subject, theme and speaker's empathy: A reexamination
of relativization phenomena. In C. Li, editor, Subject and Topic. Academic Press, New York,
pages 417-444.

[Kuno 1987] Kuno, Susumu. 1987. Functional Syntax. Chicago University Press, Chicago, IL,
USA.






[Kuno 1989] Kuno, Susumu. 1989. Identification of zero-pronominal reference in Japanese.
Unpublished Manuscript, Presented at ATR workshop on Machine Translation.

[Kuno and Kaburaki 1977] Kuno, Susumu and Etsuko Kaburaki. 1977. Empathy and syntax.
Linguistic Inquiry, 8:627-672.

[Kuroda 1965] Kuroda, Sige-Yuki. 1965. Generative Grammatical Studies in the Japanese
Language. Ph.D. thesis, MIT.

[Martin 1976] Martin, Samuel. 1976. A Reference Grammar of Japanese. Yale University Press,
New Haven, Connecticut.

[Nadathur and Joshi 1983] Nadathur, Gopalan and Aravind K. Joshi. 1983. Mutual beliefs in
conversational systems: Their role in referring expressions. In Proceedings International
Joint Conference on Artificial Intelligence, Austin, pages 603-605.

[Nakagawa 1992] Nakagawa, Hiroshi. 1992. Zero pronouns as experiencer in Japanese discourse.
In Fourteenth International Conference on Computational Linguistics, pages 324-330.

[Nakashima 1990] Nakashima, Hideyuki. 1990. Commonsense reasoning with multiply
embedded contexts. In Pacific Rim Conference on Artificial Intelligence, pages 414-419.

[Nakatani 1993] Nakatani, Christine H. 1993. Accenting on pronouns in spontaneous narrative.
In ESCA Workshop on Prosody.

[Prince 1978a] Prince, Ellen F. 1978a. On the function of existential presupposition in
discourse. In Papers from 14th Regional Meeting. CLS, Chicago, IL, pages 362-376.

[Prince 1978b] Prince, Ellen F. 1978b. On the function of wh-clefts and it-clefts in discourse.
Language, 54:883-906.

[Prince 1981a] Prince, Ellen F. 1981a. Topicalization, focus movement and Yiddish movement: a
pragmatic differentiation. In D. Alford et al., editor, Proceedings of the Seventh Annual
Meeting of the Berkeley Linguistics Society. BLS, pages 249-264.

[Prince 1981b] Prince, Ellen F. 1981b. Toward a taxonomy of given-new information. In Radical
Pragmatics. Academic Press, pages 223-255.

[Prince 1985] Prince, Ellen F. 1985. Fancy syntax and shared knowledge. Journal of
Pragmatics, pages 65-81.

[Prince 1986] Prince, Ellen F. 1986. On the syntactic marking of the presupposed open
proposition. Proceedings of the 22nd Annual Meeting of the Chicago Linguistic Society.

[Prince 1992] Prince, Ellen F. 1992. The ZPG letter: Subjects, definiteness and information
status. In S. Thompson and W. Mann, editors, Discourse description: diverse analyses of a
fund raising text. John Benjamins B.V., Philadelphia/Amsterdam, pages 295-325.

[Rambow 1993] Rambow, Owen. 1993. Pragmatic aspects of scrambling and topicalization in
German. In Institute for Research in Cognitive Science Workshop on Centering Theory in
Naturally-Occurring Discourse.

[Reinhart 1976] Reinhart, Tanya. 1976. The Syntactic Domain of Anaphora. Ph.D. thesis, MIT,
Cambridge Mass.

[Reinhart 1981] Reinhart, Tanya. 1981. Pragmatics and linguistics, an analysis of sentence
topics. Philosophica, 27:53-94. Also 1982, IULC.

[Roberts 1995] Roberts, Craige. 1995. Salience, centering and anaphora resolution in discourse
representation theory. In Marilyn A. Walker, Aravind K. Joshi, and Ellen F. Prince, editors,
Centering in Discourse. Oxford University Press.

[Shibatani 1990] Shibatani, Masayoshi. 1990. The languages of Japan. Cambridge University
Press.

[Sidner 1979] Sidner, Candace L. 1979. Toward a computational theory of definite anaphora
comprehension in English. Technical Report AI-TR-537, MIT.

[Sidner 1981] Sidner, Candace L. 1981. Focusing for interpretation of pronouns. American
Journal of Computational Linguistics, 7(4):217-231.

[Sidner 1983] Sidner, Candace L. 1983. Focusing in the comprehension of definite anaphora. In
M. Brady and R.C. Berwick, editors, Computational Models of Discourse. MIT Press.

[Silverman 1987] Silverman, Kim. 1987. The Structure and Processing of Fundamental
Frequency Contours. Ph.D. thesis, Cambridge University.

[Swerts and Geluykens 1992] Swerts, Marc and Ronald Geluykens. 1992. The prosodic
structuring of information flow in spoken discourse. In Workshop on Prosody in Natural
Speech, pages 221-230. Institute for Research in Cognitive Science, University of
Pennsylvania, TR IRCS-92-37.






[Terazu, Yamanasi, and Inada 1980] Terazu, Noriko, Masaaki Yamanasi, and Toshiaki Inada.
1980. Anaphora in Japanese. In Studies in English Linguistics, volume 8. Tokyo, The Asahi
Press, pages 32-52.

[Terken 1995] Terken, J. M. B. 1995. Accessibility, prominence, pronouns and accents. In
Marilyn A. Walker, Aravind K. Joshi, and Ellen F. Prince, editors, Centering in Discourse.
Oxford University Press.

[Turan 1995] Turan, Umit Deniz. 1995. Null vs. overt subjects in Turkish discourse: A
Centering Analysis. Ph.D. thesis, University of Pennsylvania.

[Walker 1989] Walker, Marilyn A. 1989. Evaluating discourse processing algorithms. In Proc.
27th Annual Meeting of the Association for Computational Linguistics, pages 251-261.

[Walker 1992] Walker, Marilyn A. 1992. Redundancy in collaborative dialogue. In Fourteenth
International Conference on Computational Linguistics, pages 345-351.

[Walker 1993a] Walker, Marilyn A. 1993a. Informational Redundancy and Resource Bounds in
Dialogue. Ph.D. thesis, University of Pennsylvania.

[Walker 1993b] Walker, Marilyn A. 1993b. Initial contexts and shifting centers. In Institute for
Research in Cognitive Science Workshop on Centering Theory in Naturally-Occurring
Discourse.

[Walker, Iida, and Cote 1990] Walker, Marilyn A., Masayo Iida, and Sharon Cote. 1990.
Centering in Japanese discourse. In COLING90: Proc. 13th International Conference on
Computational Linguistics, Helsinki, pages 1-8.

[Walker and Prince In Press] Walker, Marilyn A. and Ellen F. Prince. In Press. A bilateral
approach to givenness: a hearer-status algorithm and a centering algorithm. In Thorstein
Fretheim and Jeanette Gundel, editors, Reference and Referent Accessibility. John
Benjamins.

[Walker and Whittaker 1990] Walker, Marilyn A. and Steve Whittaker. 1990. Mixed initiative
in dialogue: An investigation into discourse segmentation. In Proc. 28th Annual Meeting of
the ACL, pages 70-79.

[Ward 1985] Ward, Gregory. 1985. The syntax and semantics of preposing. Ph.D. thesis,
University of Pennsylvania.

[Webber 1978] Webber, Bonnie Lynn. 1978. A Formal Approach to Discourse Anaphora. Ph.D.
thesis, Harvard University. Garland Press.

[Whittaker and Stenton 1988] Whittaker, Steve and Phil Stenton. 1988. Cues and control in
expert client dialogues. In Proc. 26th Annual Meeting of the ACL, Association for
Computational Linguistics, pages 123-130.

[Yoshimoto 1988] Yoshimoto, Kei. 1988. Identifying zero pronouns in Japanese dialogue. In
COLING88: Proc. 12th International Conference on Computational Linguistics, Budapest,
Hungary.






